IN THE UNITED STATES PATENT AND TRADEMARK OFFICE 
APPLICANTS : JAMES E. DARNELL, JR. ET AL. 

SERIAL NO. : UNASSIGNED EXAMINER : UNKNOWN 

FILED : HEREWITH ART UNIT : UNKNOWN 

FOR : NUCLEIC ACIDS ENCODING RECEPTOR RECOGNITION 

FACTORS, AND METHODS OF USE THEREOF (AMENDED) 

VIA EXPRESS MAIL: EL 406398184 US 
DATE OF DEPOSIT: JANUARY 10, 2000 

PRELIMINARY AMENDMENT 

ASSISTANT COMMISSIONER FOR PATENTS 
WASHINGTON, D.C. 20231 

Dear Sir: 

In accordance with Rule 111 of the Rules of Practice please consider the following 
amendments and remarks. 

IN THE SPECIFICATION : 

Page 1, line 6, after "Application" insert -- is a Continuation of copending U.S. Serial 
No. 08/948,547, filed October 10, 1997, which is a Division of copending U.S. Serial No. 
08/212,185, filed March 11, 1994 which --; on line 8, replace "1994" with -- 1993, both now 
abandoned, --; on line 9, after "1992" insert --, now abandoned, --; and on line 10, after "1992" 
insert --, now abandoned, --. 

On Page 5, line 27 after the open parenthesis "(" please insert -- i.e., the murine 
homologue of the human protein having an amino acid sequence of --. 



On Page 7, line 10 replace "translation protein" with -- transcription factor --. 

On Page 9, lines 10, 15, 20, and 25 please replace "seqnece" with -- sequence --. 

On Page 17, line 23 after "FIGURE" please replace "1" with -- 1A-1E --; and 
on line 28 after "FIGURE" please replace "2" with -- 2A-2D --. 

On Page 18, line 1 after "FIGURE" please replace "3" with -- 3A-3C --; and 
on line 19 replace "5" with -- 5a-5b --. 

Page 19, line 4 replace "7" with -- 7a-7e --. 

Page 23, line 5, before "the DNA", replace "(B-D)" with -- (B-C) --; line 9, before 
"the DNA", replace "(B-C)" with -- (B-D) --; and line 13, before "the DNA", replace "(B-C)" 
with -- (B-E) --. 

On Page 38, lines 4, 10, 14, and 19 please replace "seqnece" with -- sequence --. 

Page 70, line 4, before "and SEQ ID NO:7", replace "12A-12C" with -- 13A-13C --. 

Page 76, line 11, after "EXAMPLE" insert -- 6 --. 

Applicants request that the Specification be amended to include the Sequence Listing 
submitted herewith at the end of the Specification and prior to the Claims. Applicants 
enclose a copy of the Sequence Listing for the Examiner's convenience. Applicants request 
that the Specification be renumbered as follows: The Sequence Listing should now be on 
numbered pages 93-134. The pages containing the Claims as filed should be renumbered 
from pages 93-106 to pages 135-148. The page containing the abstract should be 
renumbered from page 107 to page 149. 

Please amend the title of the application to read: 

NUCLEIC ACIDS ENCODING RECEPTOR RECOGNITION FACTORS AND 
METHODS OF USE THEREOF. 

IN THE CLAIMS : 

Please cancel Claims 2-68 without prejudice. 
Please add the following new claims. 

--69. A recombinant DNA molecule encoding a receptor recognition factor (RRF) protein 
having the following characteristics: 

a) said RRF is cytoplasmic in origin; 

b) said RRF is activated by tyrosine phosphorylation; 

c) upon activation said RRF is translocated to the nucleus of a target cell; and 

d) said RRF has an amino acid sequence comprising a sequence of contiguous amino 
acid residues which is present in both SEQ ID NO:2 and SEQ ID NO:4; wherein the 
sequence of contiguous amino acid residues contains four or more consecutive amino acids. 

70. The recombinant DNA molecule of Claim 69 wherein the sequence of contiguous 
amino acid residues contains four or more consecutive amino acids and is selected from the 
group consisting of: 

a) HQLY (amino acids 19-22 of SEQ ID NO:2 and 19-22 of SEQ ID NO:4); 
b) IRQY (amino acids 31-34 of SEQ ID NO:2 and 30-33 of SEQ ID NO:4); 
c) RQYL (amino acids 32-35 of SEQ ID NO:2 and 31-34 of SEQ ID NO:4); 
d) LLQH (amino acids 82-85 of SEQ ID NO:2 and 78-81 of SEQ ID NO:4); 
e) LQHN (amino acids 83-86 of SEQ ID NO:2 and 79-82 of SEQ ID NO:4); 
f) RKEV (amino acids 210-213 of SEQ ID NO:2 and 210-213 of SEQ ID NO:4); 
g) FVVE (amino acids 316-319 of SEQ ID NO:2 and 317-320 of SEQ ID NO:4); 
h) QPCM (amino acids 321-324 of SEQ ID NO:2 and 322-325 of SEQ ID NO:4); 
i) PCMP (amino acids 322-325 of SEQ ID NO:2 and 323-326 of SEQ ID NO:4); 
j) LKTG (amino acids 334-337 of SEQ ID NO:2 and 335-338 of SEQ ID NO:4); 
k) RLLV (amino acids 345-348 of SEQ ID NO:2 and 346-349 of SEQ ID NO:4); 
l) GFRK (amino acids 372-375 of SEQ ID NO:2 and 376-379 of SEQ ID NO:4); 
m) FRKF (amino acids 373-376 of SEQ ID NO:2 and 377-380 of SEQ ID NO:4); 
n) RKFN (amino acids 374-377 of SEQ ID NO:2 and 378-381 of SEQ ID NO:4); 
o) KFNI (amino acids 375-378 of SEQ ID NO:2 and 379-382 of SEQ ID NO:4); 
p) FNIL (amino acids 376-379 of SEQ ID NO:2 and 380-383 of SEQ ID NO:4); 
q) VTEE (amino acids 424-427 of SEQ ID NO:2 and 426-429 of SEQ ID NO:4); 
r) TEEL (amino acids 425-428 of SEQ ID NO:2 and 427-430 of SEQ ID NO:4); 
s) EELH (amino acids 426-429 of SEQ ID NO:2 and 428-431 of SEQ ID NO:4); 
t) LPVV (amino acids 451-454 of SEQ ID NO:2 and 453-456 of SEQ ID NO:4); 
u) LSWQ (amino acids 500-503 of SEQ ID NO:2 and 502-505 of SEQ ID NO:4); 
v) SWQF (amino acids 501-504 of SEQ ID NO:2 and 503-506 of SEQ ID NO:4); 
w) WQFS (amino acids 502-505 of SEQ ID NO:2 and 504-507 of SEQ ID NO:4); 
x) QFSS (amino acids 503-506 of SEQ ID NO:2 and 505-508 of SEQ ID NO:4); 
y) RGLN (amino acids 510-513 of SEQ ID NO:2 and 512-515 of SEQ ID NO:4); 
z) ILEL (amino acids 560-563 of SEQ ID NO:2 and 561-564 of SEQ ID NO:4); 
zz) LWND (amino acids 571-574 of SEQ ID NO:2 and 572-575 of SEQ ID NO:4); 
aa) WNDG (amino acids 572-575 of SEQ ID NO:2 and 573-576 of SEQ ID NO:4); 
bb) IMGF (amino acids 577-580 of SEQ ID NO:2 and 578-581 of SEQ ID NO:4); 
cc) GTFL (amino acids 596-599 of SEQ ID NO:2 and 597-600 of SEQ ID NO:4); 
dd) TFLL (amino acids 597-600 of SEQ ID NO:2 and 598-601 of SEQ ID NO:4); 
ee) FLLR (amino acids 598-601 of SEQ ID NO:2 and 599-602 of SEQ ID NO:4); 
ff) LLRF (amino acids 599-602 of SEQ ID NO:2 and 600-603 of SEQ ID NO:4); 
gg) LRFS (amino acids 600-603 of SEQ ID NO:2 and 601-604 of SEQ ID NO:4); 
hh) RFSE (amino acids 601-604 of SEQ ID NO:2 and 602-605 of SEQ ID NO:4); 
ii) FSES (amino acids 602-605 of SEQ ID NO:2 and 603-606 of SEQ ID NO:4); 
jj) SESS (amino acids 603-606 of SEQ ID NO:2 and 604-607 of SEQ ID NO:4); 
kk) PYTK (amino acids 630-633 of SEQ ID NO:2 and 633-636 of SEQ ID NO:4); 
ll) ENIP (amino acids 654-657 of SEQ ID NO:2 and 657-660 of SEQ ID NO:4); 
mm) NIPE (amino acids 655-658 of SEQ ID NO:2 and 658-661 of SEQ ID NO:4); 
nn) IPEN (amino acids 656-659 of SEQ ID NO:2 and 659-662 of SEQ ID NO:4); 
oo) PENP (amino acids 657-660 of SEQ ID NO:2 and 660-663 of SEQ ID NO:4); and 
pp) ENPL (amino acids 658-661 of SEQ ID NO:2 and 661-664 of SEQ ID NO:4). 

71. The recombinant DNA molecule of Claim 70 wherein the sequence of contiguous 
amino acid residues contains five or more consecutive amino acids and is selected from the 
group consisting of: 

a) IRQYL (amino acids 31-35 of SEQ ID NO:2 and 30-34 of SEQ ID NO:4); 
b) LLQHN (amino acids 82-86 of SEQ ID NO:2 and 78-82 of SEQ ID NO:4); 
c) QPCMP (amino acids 321-325 of SEQ ID NO:2 and 322-326 of SEQ ID NO:4); 
d) GFRKF (amino acids 372-376 of SEQ ID NO:2 and 376-380 of SEQ ID NO:4); 
e) FRKFN (amino acids 373-377 of SEQ ID NO:2 and 377-381 of SEQ ID NO:4); 
f) RKFNI (amino acids 374-378 of SEQ ID NO:2 and 378-382 of SEQ ID NO:4); 
g) KFNIL (amino acids 375-379 of SEQ ID NO:2 and 379-383 of SEQ ID NO:4); 
h) VTEEL (amino acids 424-428 of SEQ ID NO:2 and 426-430 of SEQ ID NO:4); 
i) TEELH (amino acids 425-429 of SEQ ID NO:2 and 427-431 of SEQ ID NO:4); 
j) LSWQF (amino acids 500-504 of SEQ ID NO:2 and 502-506 of SEQ ID NO:4); 
k) SWQFS (amino acids 501-505 of SEQ ID NO:2 and 503-507 of SEQ ID NO:4); 
l) WQFSS (amino acids 502-506 of SEQ ID NO:2 and 504-508 of SEQ ID NO:4); 
m) LWNDG (amino acids 571-575 of SEQ ID NO:2 and 572-576 of SEQ ID NO:4); 
n) GTFLL (amino acids 596-600 of SEQ ID NO:2 and 597-601 of SEQ ID NO:4); 
o) TFLLR (amino acids 597-601 of SEQ ID NO:2 and 598-602 of SEQ ID NO:4); 
p) FLLRF (amino acids 598-602 of SEQ ID NO:2 and 599-603 of SEQ ID NO:4); 
q) LLRFS (amino acids 599-603 of SEQ ID NO:2 and 600-604 of SEQ ID NO:4); 
r) LRFSE (amino acids 600-604 of SEQ ID NO:2 and 601-605 of SEQ ID NO:4); 
s) RFSES (amino acids 601-605 of SEQ ID NO:2 and 602-606 of SEQ ID NO:4); 
t) FSESS (amino acids 602-606 of SEQ ID NO:2 and 603-607 of SEQ ID NO:4); 
u) ENIPE (amino acids 654-658 of SEQ ID NO:2 and 657-661 of SEQ ID NO:4); 
v) NIPEN (amino acids 655-659 of SEQ ID NO:2 and 658-662 of SEQ ID NO:4); 
w) IPENP (amino acids 656-660 of SEQ ID NO:2 and 659-663 of SEQ ID NO:4); and 
x) PENPL (amino acids 657-661 of SEQ ID NO:2 and 660-664 of SEQ ID NO:4). 

72. The recombinant DNA molecule of Claim 71 wherein the sequence of contiguous 
amino acid residues contains six or more consecutive amino acids and is selected from the 
group consisting of: 

a) GFRKFN (amino acids 372-377 of SEQ ID NO:2 and 376-381 of SEQ ID NO:4); 
b) FRKFNI (amino acids 373-378 of SEQ ID NO:2 and 377-382 of SEQ ID NO:4); 
c) RKFNIL (amino acids 374-379 of SEQ ID NO:2 and 378-383 of SEQ ID NO:4); 
d) VTEELH (amino acids 424-429 of SEQ ID NO:2 and 426-431 of SEQ ID NO:4); 
e) LSWQFS (amino acids 500-505 of SEQ ID NO:2 and 502-507 of SEQ ID NO:4); 
f) SWQFSS (amino acids 501-506 of SEQ ID NO:2 and 503-508 of SEQ ID NO:4); 
g) GTFLLR (amino acids 596-601 of SEQ ID NO:2 and 597-602 of SEQ ID NO:4); 
h) TFLLRF (amino acids 597-602 of SEQ ID NO:2 and 598-603 of SEQ ID NO:4); 
i) FLLRFS (amino acids 598-603 of SEQ ID NO:2 and 599-604 of SEQ ID NO:4); 
j) LLRFSE (amino acids 599-604 of SEQ ID NO:2 and 600-605 of SEQ ID NO:4); 
k) LRFSES (amino acids 600-605 of SEQ ID NO:2 and 601-606 of SEQ ID NO:4); 
l) RFSESS (amino acids 601-606 of SEQ ID NO:2 and 602-607 of SEQ ID NO:4); 
m) ENIPEN (amino acids 654-659 of SEQ ID NO:2 and 657-662 of SEQ ID NO:4); 
n) NIPENP (amino acids 655-660 of SEQ ID NO:2 and 658-663 of SEQ ID NO:4); and 
o) IPENPL (amino acids 656-661 of SEQ ID NO:2 and 659-664 of SEQ ID NO:4). 

73. The recombinant DNA molecule of Claim 72 wherein the sequence of contiguous 
amino acid residues contains seven or more consecutive amino acids and is selected from the 
group consisting of: 

a) GFRKFNI (amino acids 372-378 of SEQ ID NO:2 and 376-382 of SEQ ID NO:4); 
b) FRKFNIL (amino acids 373-379 of SEQ ID NO:2 and 377-383 of SEQ ID NO:4); 
c) LSWQFSS (amino acids 500-506 of SEQ ID NO:2 and 502-508 of SEQ ID NO:4); 
d) GTFLLRF (amino acids 596-602 of SEQ ID NO:2 and 597-603 of SEQ ID NO:4); 
e) TFLLRFS (amino acids 597-603 of SEQ ID NO:2 and 598-604 of SEQ ID NO:4); 
f) FLLRFSE (amino acids 598-604 of SEQ ID NO:2 and 599-605 of SEQ ID NO:4); 
g) LLRFSES (amino acids 599-605 of SEQ ID NO:2 and 600-606 of SEQ ID NO:4); 
h) LRFSESS (amino acids 600-606 of SEQ ID NO:2 and 601-607 of SEQ ID NO:4); 
i) ENIPENP (amino acids 654-660 of SEQ ID NO:2 and 657-663 of SEQ ID NO:4); and 
j) NIPENPL (amino acids 655-661 of SEQ ID NO:2 and 658-664 of SEQ ID NO:4). 

74. The recombinant DNA molecule of Claim 73 wherein the sequence of contiguous 
amino acid residues contains eight or more consecutive amino acids and is selected from the 
group consisting of: 

a) GFRKFNIL (amino acids 372-379 of SEQ ID NO:2 and 376-383 of SEQ ID NO:4); 
b) GTFLLRFS (amino acids 596-603 of SEQ ID NO:2 and 597-604 of SEQ ID NO:4); 
c) TFLLRFSE (amino acids 597-604 of SEQ ID NO:2 and 598-605 of SEQ ID NO:4); 
d) FLLRFSES (amino acids 598-605 of SEQ ID NO:2 and 599-606 of SEQ ID NO:4); 
e) LLRFSESS (amino acids 599-606 of SEQ ID NO:2 and 600-607 of SEQ ID NO:4); and 
f) ENIPENPL (amino acids 654-661 of SEQ ID NO:2 and 657-664 of SEQ ID NO:4). 

75. The recombinant DNA molecule of Claim 74 wherein the sequence of contiguous 
amino acid residues contains nine or more consecutive amino acids and is selected from the 
group consisting of: 

a) GTFLLRFSE (amino acids 596-604 of SEQ ID NO:2 and 597-605 of SEQ ID NO:4); 
b) TFLLRFSES (amino acids 597-605 of SEQ ID NO:2 and 598-606 of SEQ ID NO:4); and 
c) FLLRFSESS (amino acids 598-606 of SEQ ID NO:2 and 599-607 of SEQ ID NO:4). 

76. The recombinant DNA molecule of Claim 75 wherein the sequence of contiguous 
amino acid residues contains ten or more consecutive amino acids and is selected from the 
group consisting of: 

a) GTFLLRFSES (amino acids 596-605 of SEQ ID NO:2 and 597-606 of SEQ ID NO:4); and 
b) TFLLRFSESS (amino acids 597-606 of SEQ ID NO:2 and 598-607 of SEQ ID NO:4). 

77. The recombinant DNA molecule of Claim 76 wherein the sequence of contiguous 
amino acid residues contains eleven consecutive amino acids having the sequence 
GTFLLRFSESS (amino acids 596-606 of SEQ ID NO:2 and 597-607 of SEQ ID NO:4). 

78. The recombinant DNA molecule of Claim 70 wherein said RRF has an amino acid 
sequence which further comprises a second sequence of contiguous amino acid residues, 
wherein the second sequence of contiguous amino acid residues also contains four or more 
consecutive amino acids which is present in both SEQ ID NO:2 and SEQ ID NO:4. 

79. A recombinant DNA molecule encoding a receptor recognition factor (RRF) protein 
having the following characteristics: 

a) said RRF is cytoplasmic in origin; 

b) said RRF is activated by tyrosine phosphorylation; and 

c) upon activation said RRF is translocated to the nucleus of a target cell; wherein 
said DNA molecule hybridizes to the nucleotide sequence set forth in SEQ ID NO:1 under 
standard hybridization conditions. 

80. A recombinant DNA molecule encoding a receptor recognition factor (RRF) protein 
having the following characteristics: 

a) said RRF is cytoplasmic in origin; 

b) said RRF is activated by tyrosine phosphorylation; and 

c) upon activation said RRF is translocated to the nucleus of a target cell; wherein 
said DNA molecule hybridizes to the nucleotide sequence set forth in SEQ ID NO:3 under 
standard hybridization conditions. 

81. A recombinant DNA molecule encoding a receptor recognition factor (RRF) protein 
having the following characteristics: 

(a) the RRF is cytoplasmic in origin; 

(b) the RRF is activated by tyrosine phosphorylation; and 

(c) upon activation said RRF is translocated to the nucleus of a target cell; 
wherein the RRF contains one or more of the boxed regions in Figure 8B. 

82. The recombinant DNA molecule of Claim 81, wherein the RRF further contains a 
tyrosyl residue at a position that corresponds to the conserved position identified in SEQ ID 
NO:2 and SEQ ID NO:4, said position selected from the group consisting of: 

amino acid 22 of SEQ ID NO:2 and amino acid 22 of SEQ ID NO:4; 
amino acid 34 of SEQ ID NO:2 and amino acid 33 of SEQ ID NO:4; 
amino acid 288 of SEQ ID NO:2 and amino acid 289 of SEQ ID NO:4; 
amino acid 631 of SEQ ID NO:2 and amino acid 634 of SEQ ID NO:4; 
amino acid 648 of SEQ ID NO:2 and amino acid 651 of SEQ ID NO:4; 
amino acid 665 of SEQ ID NO:2 and amino acid 668 of SEQ ID NO:4; 
amino acid 677 of SEQ ID NO:2 and amino acid 680 of SEQ ID NO:4; 
amino acid 678 of SEQ ID NO:2 and amino acid 681 of SEQ ID NO:4; and 
amino acid 690 of SEQ ID NO:2 and amino acid 701 of SEQ ID NO:4. 

83. The recombinant DNA molecule of Claim 81 wherein the RRF comprises a highly 
negative charged domain at its C-terminal end. 

84. The recombinant DNA molecule of Claim 81 wherein the RRF comprises an SH2 
domain. 

85. The recombinant DNA molecule of Claim 84 wherein the SH2 domain contains an 
arginine at a position that corresponds to amino acid 601 of SEQ ID NO:2 and amino acid 
602 of SEQ ID NO:4. 

86. The recombinant DNA molecule of Claim 81 wherein the RRF forms a dimer upon 
said activation by tyrosine phosphorylation. 

87. The recombinant DNA molecule of Claim 81 wherein the activation of the RRF is 
unaffected by the presence or concentration of second messengers. 

88. The recombinant DNA molecule of Claim 81 wherein the RRF can act as a DNA 
binding protein upon said activation by tyrosine phosphorylation. 

89. The recombinant DNA molecule of Claim 81 wherein the RRF interacts with an 
interferon-γ-bound receptor kinase complex. 

90. The recombinant DNA molecule of Claim 88 wherein the RRF can stimulate ISRE- 
dependent or gamma activated site (GAS)-dependent transcription. 

91. An isolated nucleic acid encoding a receptor recognition factor (RRF) protein 
having the following characteristics: 

(a) the RRF is cytoplasmic in origin; 

(b) the RRF is activated by tyrosine phosphorylation; and 
(c) upon activation said RRF is translocated to the nucleus of a target cell; 
wherein the RRF contains one or more of the boxed regions in Figure 8B. 

92. The isolated nucleic acid of Claim 91, wherein the RRF further contains a tyrosyl 
residue at a position that corresponds to the conserved position identified in SEQ ID NO:2 
and SEQ ID NO:4, said position selected from the group consisting of: 

amino acid 22 of SEQ ID NO:2 and amino acid 22 of SEQ ID NO:4; 
amino acid 34 of SEQ ID NO:2 and amino acid 33 of SEQ ID NO:4; 
amino acid 288 of SEQ ID NO:2 and amino acid 289 of SEQ ID NO:4; 
amino acid 631 of SEQ ID NO:2 and amino acid 634 of SEQ ID NO:4; 
amino acid 648 of SEQ ID NO:2 and amino acid 651 of SEQ ID NO:4; 
amino acid 665 of SEQ ID NO:2 and amino acid 668 of SEQ ID NO:4; 
amino acid 677 of SEQ ID NO:2 and amino acid 680 of SEQ ID NO:4; 
amino acid 678 of SEQ ID NO:2 and amino acid 681 of SEQ ID NO:4; and 
amino acid 690 of SEQ ID NO:2 and amino acid 701 of SEQ ID NO:4. 

93. The recombinant DNA molecule of Claim 81 that is operatively linked to an 
expression control sequence. 

94. An expression vector containing the recombinant DNA molecule of Claim 93. 

95. A method of expressing a recombinant receptor recognition factor in a cell 
containing the expression vector of Claim 94 comprising culturing the cell in an appropriate 
cell culture medium under conditions that provide for expression of the receptor recognition 
factor by the cell. 

96. The method of Claim 95 further comprising the step of purifying the recombinant 
receptor recognition factor.-- 

REMARKS 



The Specification has been amended to incorporate references to the appropriate SEQ 
ID NOs, where specific sequences are indicated. Applicants have amended the Specification 
to note the status of the parent applications above, and have also addressed the issues 
concerning 37 CFR 1.821(c). The Specification has been amended as described above, in 
order to correct obvious typographical errors. 

At the outset, Applicants bring to the Examiner's attention the NOTICE OF 
INCOMPLETE APPLICATION issued on 4/20/94 in the co-pending parent application, 
Serial No. 08/212,185. Applicants responded on 5/26/94 with a PETITION FOR FILING 
DATE UNDER 37 C.F.R. § 1.53(b) which was favorably received in the DECISION ON 
PETITION issued by the Special Program Examiner on August 8, 1994. Copies of these 
papers are included in the present filing. 

Support for the newly added claims can be found throughout the present Specification 
as indicated below, and more specifically in the Specification as originally filed on March 19, 
1992; see lines 1-21 on Page 37 of the present Specification. Support for Claims 69-80 as 
related to the properties of the claimed RRFs may be found on lines 18-29 of Page 4; on lines 
1-12 of Page 8; on lines 13-24 of Page 12; and throughout the first three Examples. Further 
support for the claimed sequence homology between SEQ ID NOs:2 and 4 may be found on 
lines 8-10 and 15-22 of Page 6; on lines 3-5 of Page 20; and throughout Figure 8B. Further 
support for Claims 79-80 may be found on lines 15-18 of Page 6; Claims 14 and 15; lines 25- 
27 on Page 32; on lines 6-8 of Page 35; and throughout the first three Examples. Further 
support for Claims 81 and 91 may be found in the Specification on line 18 of Page 4 through 
line 3 of Page 5, on lines 1-12 of Page 8, on lines 13-24 of Page 12, on lines 3-10 of Page 21, 
and in Figure 8B, on lines 10-30 of Page 35, and throughout the Examples. Further support 
for Claims 82 and 92 may be found on lines 8-10 of Page 21, and in Figure 8B of the 
Specification. Further support for Claim 83 may be found on lines 28-30 of Page 20 of the 
Specification. Further support for Claims 84-86 may be found in the Specification on lines 
1-8 of Page 5, on lines 22-29 of Page 7, on lines 4-18 of Page 26, line 4 of Page 83 through 
line 27 of Page 87, and in Figures 19-23. Further support for Claim 87 may be found on 
Page 5, lines 10-14, and Page 7, lines 1-2 of the Specification. Further support for Claims 
88-90 may be found in the Specification on Page 7, lines 9-20, on Page 8, lines 1-12, on Page 
24, lines 23-29, on Page 36, lines 7-24, and in Figure 18. Further support for Claims 93-96 
may be found in the Specification on lines 2-7 and 14-26 of Page 10, on lines 5-12 of Page 
11, and on lines 15-18 of Page 16. Claims 1 and 69-96 remain for consideration. 

No additional fees are believed to be necessitated by the foregoing amendments. 
However, should this be erroneous, authorization is hereby given to charge Deposit Account 
No. 11-1153 for any underpayment, or credit any overages. 

Applicants respectfully request entry of the foregoing amendment into the file history 
of the above-identified Application being filed herewith. Early and favorable action on the 
pending set of Claims is earnestly solicited. 

Respectfully submitted, 

MICHAEL D. DAVIS 
Attorney for Applicant(s) 
Registration No. 39,161 

KLAUBER & JACKSON 
411 Hackensack Avenue 
Hackensack, New Jersey 07601 
(201) 487-5800 
Date: January 10, 2000 






RECEPTOR RECOGNITION FACTORS, PROTEIN SEQUENCES 
AND METHODS OF USE THEREOF 



CROSS-REFERENCE TO RELATED APPLICATIONS 

The present Application is a Continuation-In-Part of copending U.S. Serial No. 
08/126,588 and copending U.S. Serial No. 08/126,595, both filed September 24, 
1994, which are both Continuations-In-Part of copending U.S. Serial No. 
07/980,498, filed November 23, 1992, which is a Continuation-In-Part of 
copending U.S. Serial No. 07/854,296, filed March 19, 1992, the disclosures of 
which are hereby incorporated by reference in their entireties. Applicants claim 
the benefits of these Applications under 35 U.S.C. § 120. 

RELATED PUBLICATIONS 


The Applicants are authors or co-authors of several articles directed to the subject 
matter of the present invention: (1) Darnell et al., "Interferon-Dependent 
Transcriptional Activation: Signal Transduction Without Second Messenger 
Involvement?" THE NEW BIOLOGIST, 2(10):1-4 (1990); (2) X. Fu et al., 
"ISGF3, The Transcriptional Activator Induced by Interferon α, Consists of 
Multiple Interacting Polypeptide Chains" PROC. NATL. ACAD. SCI. USA, 
87:8555-8559 (1990); (3) D.S. Kessler et al., "IFNα Regulates Nuclear 
Translocation and DNA-Binding Affinity of ISGF3, A Multimeric Transcriptional 
Activator" GENES AND DEVELOPMENT, 4:1753 (1990). All of the above 
listed articles are incorporated herein by reference. 

TECHNICAL FIELD OF THE INVENTION 

The present invention relates generally to intracellular receptor recognition 
proteins or factors (i.e., groups of proteins), and to methods and compositions 
including such factors or the antibodies reactive toward them, or analogs thereof in 
assays and for diagnosing, preventing and/or treating cellular debilitation, 
derangement or dysfunction. More particularly, the present invention relates to 
particular IFN-dependent receptor recognition molecules that have been identified 
and sequenced, and that demonstrate direct participation in intracellular events, 
extending from interaction with the liganded receptor at the cell surface to 
transcription in the nucleus, and to antibodies or to other entities specific thereto 
that may thereby selectively modulate such activity in mammalian cells. 

BACKGROUND OF THE INVENTION 

There are several possible pathways of signal transduction that might be followed 
after a polypeptide ligand binds to its cognate cell surface receptor. Within 
minutes of such ligand-receptor interaction, genes that were previously quiescent 
are rapidly transcribed (Murdoch et al., 1982; Larner et al., 1984; Friedman et 
al., 1984; Greenberg and Ziff, 1984; Greenberg et al., 1985). One of the most 
physiologically important, yet poorly understood, aspects of these immediate 
transcriptional responses is their specificity: the set of genes activated, for 
example, by platelet-derived growth factor (PDGF), does not completely overlap 
with the one activated by nerve growth factor (NGF) or tumor necrosis factor 
(TNF) (Cochran et al., 1983; Greenberg et al., 1985; Almendral et al., 1988; Lee 
et al., 1990). The interferons (IFN) activate sets of other genes entirely. Even 
IFNα and IFNγ, whose presence results in the slowing of cell growth and in an 
increased resistance to viruses (Tamm et al., 1987), do not activate exactly the 
same set of genes (Larner et al., 1984; Friedman et al., 1984; Celis et al., 1987, 
1985; Larner et al., 1986). 

The current hypotheses related to signal transduction pathways in the cytoplasm do 
not adequately explain the high degree of specificity observed in polypeptide- 
dependent transcriptional responses. The most commonly discussed pathways of 
signal transduction that might ultimately lead to the nucleus depend on properties 
of cell surface receptors containing tyrosine kinase domains [for example, PDGF, 
epidermal growth factor (EGF), colony-stimulating factor (CSF), insulin-like 
growth factor-1 (IGF-1); see Gill, 1990; Hunter, 1990] or of receptors that interact 
with G-proteins (Gilman, 1987). These two groups of receptors mediate changes 
in the intracellular concentrations of second messengers that, in turn, activate one 
of a series of protein phosphokinases, resulting in a cascade of phosphorylations 
(or dephosphorylations) of cytoplasmic proteins. 

It has been widely conjectured that the cascade of phosphorylations secondary to 
changes in intracellular second messenger levels is responsible for variations in the 
rates of transcription of particular genes (Bourne, 1988, 1990; Berridge, 1987; 
Gill, 1990; Hunter, 1990). However, there are at least two reasons to question 
the suggestion that global changes in second messengers participate in the chain of 
events leading to specific transcriptional responses dependent on specific receptor 
occupation by polypeptide ligands. 

First, there is a limited number of second messengers (cAMP, diacyl glycerol, 
phosphoinositides, and Ca²⁺ are the most prominently discussed), whereas the 
number of known cell surface receptor-ligand pairs of only the tyrosine kinase and 
G-protein varieties, for example, already greatly outnumbers the list of second 
messengers, and could easily stretch into the hundreds (Gill, 1990; Hunter, 1990). 
In addition, since many different receptors can coexist on one cell type at any 
instant, a cell can be called upon to respond simultaneously to two or more 
different ligands with an individually specific transcriptional response, each 
involving a different set of target genes. Second, a number of receptors for 
polypeptide ligands are now known that have neither tyrosine kinase domains nor 
any structure suggesting interaction with G-proteins. These include the receptors 
for interleukin-2 (IL-2) (Leonard et al., 1985), IFNα (Uze et al., 1990), IFNγ 
(Aguet et al., 1988), NGF (Johnson et al., 1986), and growth hormone (Leung et 
al., 1987). The binding of each of these receptors to its specific ligand has been 
demonstrated to stimulate transcription of a specific set of genes. For these 
reasons it seems unlikely that global intracellular fluctuations in a limited set of 
second messengers are integral to the pathway of specific, polypeptide ligand- 
dependent, immediate transcriptional responses. 

In PCT International Publication No. WO 92/08740 published 29 May, 1992 by 
the Applicant herein, the above analysis was presented and it was discovered and 
proposed that a receptor recognition factor or factors served in some capacity as a 
type of direct messenger between liganded receptors at the cell surface and the cell 
nucleus. One of the characteristics that was ascribed to the receptor recognition 
factor was its apparent lack of requirement for changes in second messenger 
concentrations. Continued investigation of the receptor recognition factor through 
study of the actions of the interferons IFNα and IFNγ has further elucidated the 
characteristics and structure of the interferon-related factor ISGF-3, and more 
broadly, the characterization and structure of the receptor recognition factor in a 
manner that extends beyond earlier discoveries previously described. It is 
accordingly to the presentation of this updated characterization of the receptor 
recognition factor and the materials and methods both diagnostic and therapeutic 
corresponding thereto that the present disclosure is directed. 


SUMMARY OF THE INVENTION 



In accordance with the present invention, receptor recognition factors have been 
further characterized that appear to interact directly with receptors that have been 
occupied by their ligand on cellular surfaces, and which in turn either become 
active transcription factors, or activate or directly associate with transcription 
factors that enter the cell's nucleus and specifically bind to predetermined sites 
and thereby activate the genes. It should be noted that the receptor recognition 
proteins thus possess multiple properties, among them: 1) recognizing and being 
activated during such recognition by receptors; 2) being translocated to the nucleus 
by an inhibitable process (e.g., NaF inhibits translocation); and 3) combining with 
transcription activating proteins or acting themselves as transcription activation 
proteins, and that all of these properties are possessed by the proteins described 
herein. 

A further property of the receptor recognition factors (also termed herein signal 
transducers and activators of transcription - STAT) is dimerization to form 
homodimers or heterodimers upon activation by phosphorylation of tyrosine. In a 
specific embodiment, infra, Stat91 and Stat84 form homodimers and a Stat91- 
Stat84 heterodimer. Accordingly, the present invention is directed to such dimers, 
which can form spontaneously by phosphorylation of the STAT protein, or which 
can be prepared synthetically by chemically cross-linking two like or unlike STAT 
proteins. 

The receptor recognition factor is proteinaceous in composition and is believed to 
be present in the cytoplasm. The recognition factor is not demonstrably affected 
by concentrations of second messengers, but does exhibit direct interaction 
with tyrosine kinase domains, although it exhibits no apparent interaction with G- 
proteins. More particularly, as is shown in a co-pending, co-owned application 
entitled "INTERFERON-ASSOCIATED RECEPTOR RECOGNITION 
FACTORS, NUCLEIC ACIDS ENCODING THE SAME AND METHODS OF 
USE THEREOF," filed on even date herewith, the 91 kD human interferon 
(IFN)-γ factor, represented by SEQ ID NO:4, directly interacts with DNA after 
acquiring phosphate on tyrosine located at position 701 of the amino acid 
sequence. 

The recognition factor is now known to comprise several proteinaceous
substituents, in the instance of IFNα and IFNγ. Particularly, three proteins
derived from the factor ISGF-3 have been successfully sequenced and their
sequences are set forth in FIGURE 1 (SEQ ID NOS:1, 2), FIGURE 2 (SEQ ID
NOS:3, 4) and FIGURE 3 (SEQ ID NOS:5, 6) herein. Additionally, a murine
gene encoding the 91 kD protein (SEQ ID NO:4) has been identified and
sequenced. The nucleotide sequence (SEQ ID NO:7) and deduced amino acid
sequence (SEQ ID NO:8) are shown in FIGURE 13A-13C.

In a further embodiment, murine genes encoding homologs of the recognition
factor have been successfully sequenced and cloned into plasmids. A gene in
plasmid 13sf1 has the nucleotide sequence (SEQ ID NO:9) and deduced amino
acid sequence (SEQ ID NO:10) as shown in FIGURE 14A-14C. A gene in
plasmid 19sf6 has the nucleotide sequence (SEQ ID NO:11) and deduced amino
acid sequence (SEQ ID NO:12) shown in FIGURE 15A-15C.

It is particularly noteworthy that the protein sequence of FIGURE 1 (SEQ ID
NO:2) and the sequence of the proteins of FIGURES 2 (SEQ ID NO:4) and 3
(SEQ ID NO:6) derive, respectively, from two different but related genes.
Moreover, the protein sequence of FIGURE 13 (SEQ ID NO:8) derives from a
murine gene that is analogous to the gene encoding the protein of FIGURE 2 (SEQ
ID NO:4). Of further note is that the protein sequences of FIGURES 14 (SEQ ID
NO:10) and 15 (SEQ ID NO:12) derive from two genes that are different from,
but related to, the gene encoding the protein of FIGURE 13 (SEQ ID NO:8). It is
clear from these discoveries that a family of genes exists, and that further family
members likewise exist. Accordingly, as demonstrated herein, by use of
hybridization techniques, additional such family members will be found.

Further, the capacity of such family members to function in the manner of the
receptor recognition factors disclosed herein may be assessed by determining
those ligands that cause the phosphorylation of the particular family members.

In its broadest aspect, the present invention extends to a receptor recognition
factor implicated in the transcriptional stimulation of genes in target cells in
response to the binding of a specific polypeptide ligand to its cellular receptor on
said target cell, said receptor recognition factor having the following
characteristics:

a) apparent direct interaction with the ligand-bound receptor complex
and activation of one or more transcription factors capable of binding with a
specific gene;

b) an activity demonstrably unaffected by the presence or concentration
of second messengers;

c) direct interaction with tyrosine kinase domains; and

d) a perceived absence of interaction with G-proteins.

In a further aspect, the receptor recognition (STAT) protein forms a dimer upon 
activation by phosphorylation. 

In a specific example, the receptor recognition factor represented by SEQ ID
NO:4 possesses the added capability of acting as a transcription factor and, in
particular, as a DNA binding protein in response to interferon-γ stimulation. This
discovery presages an expanded role for the proteins in question, and other
proteins and like factors that have heretofore been characterized as receptor
recognition factors. It is therefore apparent that a single factor may indeed
provide the nexus between the liganded receptor at the cell surface and direct
participation in DNA transcriptional activity in the nucleus. This pleiotypic factor
has the following characteristics:

a) It interacts with an interferon-γ-bound receptor kinase complex;

b) It is a tyrosine kinase substrate; and

c) When phosphorylated, it serves as a DNA binding protein.

More particularly, the factor represented by SEQ ID NO:4 is interferon-dependent
in its activity and is responsive to interferon stimulation, particularly that of
interferon-γ. It has further been discovered that activation of the factor
represented by SEQ ID NO:4 requires phosphorylation of tyrosine-701 of the
protein, and further still that tyrosine phosphorylation requires the presence of a
functionally active SH2 domain in the protein. Preferably, such SH2 domain
contains an amino acid residue corresponding to an arginine at position 602 of the
protein.

In a still further aspect, the present invention extends to a receptor recognition
factor interactive with a liganded interferon receptor, which receptor recognition
factor possesses the following characteristics:

a) it is present in cytoplasm;

b) it undergoes tyrosine phosphorylation upon treatment of cells with IFNα
or IFNγ;

c) it activates transcription of an interferon stimulated gene;

d) it stimulates either an ISRE-dependent or a gamma activated site
(GAS)-dependent transcription in vivo;

e) it interacts with IFN cellular receptors; and

f) it undergoes nuclear translocation upon stimulation of the IFN cellular
receptors with IFN.

The factor of the invention represented by SEQ ID NO:4 appears to act in similar
fashion to an earlier determined site-specific DNA binding protein that is
interferon-γ dependent and that has been earlier called the γ activating factor
(GAF). Specifically, interferon-γ-dependent activation of this factor occurs
without new protein synthesis and appears within minutes of interferon-γ
treatment, achieves maximum extent between 15 and 30 minutes thereafter, and
then disappears after 2-3 hours. These further characteristics of identification and
action assist in the evaluation of the present factor for applications having both
diagnostic and therapeutic significance.

In a particular embodiment, the present invention relates to all members of the
herein disclosed family of receptor recognition factors except the 91 kD protein
factors, specifically the proteins whose sequences are represented by one or more
of SEQ ID NO:4, SEQ ID NO:6 or SEQ ID NO:8.

The present invention also relates to a recombinant DNA molecule or cloned gene,
or a degenerate variant thereof, which encodes a receptor recognition factor, or a
fragment thereof, that possesses a molecular weight of about 113 kD and an amino
acid sequence set forth in FIGURE 1 (SEQ ID NO:2); preferably a nucleic acid
molecule, in particular a recombinant DNA molecule or cloned gene, encoding the
113 kD receptor recognition factor has a nucleotide sequence or is complementary
to a DNA sequence shown in FIGURE 1 (SEQ ID NO:1). In another
embodiment, the receptor recognition factor has a molecular weight of about 91
kD and the amino acid sequence set forth in FIGURE 2 (SEQ ID NO:4) or
FIGURE 13 (SEQ ID NO:8); preferably a nucleic acid molecule, in particular a
recombinant DNA molecule or cloned gene, encoding the 91 kD receptor
recognition factor has a nucleotide sequence or is complementary to a DNA
sequence shown in FIGURE 2 (SEQ ID NO:3) or FIGURE 13 (SEQ ID NO:7). In
yet a further embodiment, the receptor recognition factor has a molecular weight
of about 84 kD and the amino acid sequence set forth in FIGURE 3 (SEQ ID
NO:6); preferably a nucleic acid molecule, in particular a recombinant DNA
molecule or cloned gene, encoding the 84 kD receptor recognition factor has a
nucleotide sequence or is complementary to a DNA sequence shown in FIGURE 3
(SEQ ID NO:5). In yet another embodiment, the receptor recognition factor has
an amino acid sequence set forth in FIGURE 14 (SEQ ID NO:10); preferably a
nucleic acid molecule, in particular a recombinant DNA molecule or cloned gene,
encoding such receptor recognition factor has a nucleotide sequence or is
complementary to a DNA sequence shown in FIGURE 14 (SEQ ID NO:9). In still
another embodiment, the receptor recognition factor has an amino acid sequence
set forth in FIGURE 15 (SEQ ID NO:12); preferably a nucleic acid molecule, in
particular a recombinant DNA molecule or cloned gene, encoding such receptor
recognition factor has a nucleotide sequence or is complementary to a DNA
sequence shown in FIGURE 15 (SEQ ID NO:11).

The human and murine DNA sequences of the receptor recognition factors of the
present invention, or portions thereof, may be prepared as probes to screen for
complementary sequences and genomic clones in the same or alternate species.
The present invention extends to probes so prepared that may be provided for
screening cDNA and genomic libraries for the receptor recognition factors. For
example, the probes may be prepared with a variety of known vectors, such as the
phage λ vector. The present invention also includes the preparation of plasmids
including such vectors, and the use of the DNA sequences to construct vectors
expressing antisense RNA or ribozymes which would attack the mRNAs of any or
all of the DNA sequences set forth in FIGURES 1, 2, 3, 13, 14 and 15 (SEQ ID
NOS:1, 3, 5, 7, 9, and 11, respectively). Correspondingly, the preparation of
antisense RNA and ribozymes is included herein.

The present invention also includes receptor recognition factor proteins having the
activities noted herein, and that display the amino acid sequences set forth and
described above and selected from SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6,
SEQ ID NO:8, SEQ ID NO:10 and SEQ ID NO:12.

In a further embodiment of the invention, the full DNA sequence of the
recombinant DNA molecule or cloned gene so determined may be operatively
linked to an expression control sequence which may be introduced into an
appropriate host. The invention accordingly extends to unicellular hosts
transformed with the cloned gene or recombinant DNA molecule comprising a
DNA sequence encoding the present receptor recognition factor(s), and more
particularly, the complete DNA sequence determined from the sequences set forth
above and in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ
ID NO:9 and SEQ ID NO:11.

According to other preferred features of certain preferred embodiments of the
present invention, a recombinant expression system is provided to produce
biologically active animal or human receptor recognition factor.

The concept of the receptor recognition factor contemplates that specific factors
exist for correspondingly specific ligands, such as tumor necrosis factor, nerve
growth factor and the like, as described earlier. Accordingly, the exact structure
of each receptor recognition factor will understandably vary so as to achieve this
ligand and activity specificity. It is this specificity and the direct involvement of
the receptor recognition factor in the chain of events leading to gene activation
that offer the promise of a broad spectrum of diagnostic and therapeutic utilities.

The present invention naturally contemplates several means for preparation of the
recognition factor, including as illustrated herein known recombinant techniques,
and the invention is accordingly intended to cover such synthetic preparations
within its scope. The isolation of the cDNA and amino acid sequences disclosed
herein facilitates the reproduction of the recognition factor by such recombinant
techniques, and accordingly, the invention extends to expression vectors prepared
from the disclosed DNA sequences for expression in host systems by recombinant
DNA techniques, and to the resulting transformed hosts.

The invention includes an assay system for screening of potential drugs effective to
modulate transcriptional activity of target mammalian cells by interrupting or
potentiating the recognition factor or factors. In one instance, the test drug could
be administered to a cellular sample with the ligand that activates the receptor
recognition factor, or an extract containing the activated recognition factor, to
determine its effect upon the binding activity of the recognition factor to any
chemical sample (including DNA), or to the test drug, by comparison with a
control.

The assay system could more importantly be adapted to identify drugs or other
entities that are capable of binding to the receptor recognition and/or transcription
factors or proteins, either in the cytoplasm or in the nucleus, thereby inhibiting or
potentiating transcriptional activity. Such assay would be useful in the
development of drugs that would be specific against particular cellular activity, or
that would potentiate such activity, in time or in level of activity. For example,
such drugs might be used to modulate cellular response to shock, or to treat other
pathologies, as for example, in making IFN more potent against cancer.

In yet a further embodiment, the invention contemplates antagonists of the activity
of a receptor recognition factor (STAT). In particular, an agent or molecule that
inhibits dimerization (homodimerization or heterodimerization) can be used to
block transcription activation effected by an activated, phosphorylated STAT
protein. In a specific embodiment, the antagonist can be a peptide having the
sequence of a portion of an SH2 domain of a STAT protein, or the phosphotyrosine
domain of a STAT protein, or both. If the peptide contains both regions,
preferably the regions are located in tandem, more preferably with the SH2
domain portion N-terminal to the phosphotyrosine portion. In a specific example,
infra, such peptides are shown to be capable of disrupting dimerization of STAT
proteins.

One of the characteristics of the present receptor recognition factors is their
participation in rapid phosphorylation and dephosphorylation during the course of
and as part of their activity. Significantly, such phosphorylation takes place in an
interferon-dependent manner and within a few minutes in the case of the ISGF-3
proteins identified herein, on the tyrosine residues defined thereon. This is strong
evidence that the receptor recognition factors disclosed herein are the first true
substrates whose intracellular function is well understood and whose intracellular
activity depends on tyrosine kinase phosphorylation. In particular, the addition of
phosphate to the tyrosine of a transcription factor is novel. This suggests further
that tyrosine kinase takes direct action in the transmission of intracellular signals to
the nucleus, and does not merely serve as a promoter or mediator of serine and/or
threonine kinase activity, as has been theorized to date. Also, the role of the factor
represented by SEQ ID NO:2 in its activated phosphorylated form suggests
possible independent therapeutic use for this activated form. Likewise, the role of
the factor as a tyrosine kinase substrate suggests its interaction with kinase in other
theatres apart from the complex observed herein.

The diagnostic utility of the present invention extends to the use of the present
receptor recognition factors in assays to screen for tyrosine kinase inhibitors.
Because the activity of the receptor recognition-transcriptional activation proteins
described herein must maintain tyrosine phosphorylation, they can be, and
presumably are, dephosphorylated by specific tyrosine phosphatases. Blocking of
the specific phosphatase is therefore an avenue of pharmacological intervention
that would potentiate the activity of the receptor recognition proteins.

The present invention likewise extends to the development of antibodies against the
receptor recognition factor(s), including naturally raised and recombinantly
prepared antibodies. For example, the antibodies could be used to screen
expression libraries to obtain the gene or genes that encode the receptor
recognition factor(s). Such antibodies could include both polyclonal and
monoclonal antibodies prepared by known genetic techniques, as well as
bi-specific (chimeric) antibodies, and antibodies including other functionalities
suiting them for additional diagnostic use conjunctive with their capability of
modulating transcriptional activity.

In particular, antibodies against specifically phosphorylated factors can be selected
and are included within the scope of the present invention for their particular
ability in following activated protein. Thus, activity of the recognition factors or
of the specific polypeptides believed to be causally connected thereto may
be followed directly by the assay techniques discussed later on, through
the use of an appropriately labeled quantity of the recognition factor or antibodies
or analogs thereof.

Thus, the receptor recognition factors, their analogs, and any
antagonists or antibodies that may be raised thereto, are capable of use in
connection with various diagnostic techniques, including immunoassays, such as a
radioimmunoassay, using for example, an antibody to the receptor recognition
factor that has been labeled by either radioactive addition, reduction with sodium
borohydride, or radioiodination.

In an immunoassay, a control quantity of the antagonists or antibodies thereto, or
the like may be prepared and labeled with an enzyme, a specific binding partner
and/or a radioactive element, and may then be introduced into a cellular sample.
After the labeled material or its binding partner(s) has had an opportunity to react
with sites within the sample, the resulting mass may be examined by known
techniques, which may vary with the nature of the label attached. For example,
antibodies against specifically phosphorylated factors may be selected and
appropriately employed in the exemplary assay protocol, for the purpose of
following activated protein as described above.

In the instance where a radioactive label, such as one of the isotopes ³H, ¹⁴C, ³⁵S,
³⁶Cl, ⁵¹Cr, ⁵⁷Co, ⁵⁸Co, ⁵⁹Fe, ¹²⁵I, ¹³¹I, and ¹⁸⁶Re, is used, known currently
available counting procedures may be utilized. In the instance where the label is
an enzyme, detection may be accomplished by any of the presently utilized
colorimetric, spectrophotometric, fluorospectrophotometric, amperometric or
gasometric techniques known in the art.

The present invention includes an assay system which may be prepared in the form
of a test kit for the quantitative analysis of the extent of the presence of the
recognition factors, or to identify drugs or other agents that may mimic or block
their activity. The system or test kit may comprise a labeled component prepared
by one of the radioactive and/or enzymatic techniques discussed herein, coupling a
label to the recognition factors, their agonists and/or antagonists, and one or more
additional immunochemical reagents, at least one of which is a free or
immobilized ligand, capable either of binding with the labeled component, its
binding partner, one of the components to be determined or their binding
partner(s).

In a further embodiment, the present invention relates to certain therapeutic
methods which would be based upon the activity of the recognition factor(s), its
(or their) subunits, or active fragments thereof, or upon agents or other drugs
determined to possess the same activity. A first therapeutic method is associated
with the prevention of the manifestations of conditions causally related to or
following from the binding activity of the recognition factor or its subunits, and
comprises administering an agent capable of modulating the production and/or
activity of the recognition factor or subunits thereof, either individually or in
mixture with each other in an amount effective to prevent the development of
those conditions in the host. For example, drugs or other binding partners to the
receptor recognition/transcription factors or proteins may be administered to
inhibit or potentiate transcriptional activity, as in the potentiation of interferon in
cancer therapy. Also, the blockade of the action of specific tyrosine phosphatases
in the dephosphorylation of activated (phosphorylated) recognition/transcription
factors or proteins presents a method for potentiating the activity of the receptor
recognition factor or protein that would concomitantly potentiate therapies based
on receptor recognition factor/protein activation.

More specifically, the therapeutic method generally referred to herein could
include the method for the treatment of various pathologies or other cellular
dysfunctions and derangements by the administration of pharmaceutical
compositions that may comprise effective inhibitors or enhancers of activation of
the recognition factor or its subunits, or other equally effective drugs developed
for instance by a drug screening assay prepared and used in accordance with a
further aspect of the present invention. For example, drugs or other binding
partners to the receptor recognition/transcription factor or proteins, as represented
by SEQ ID NO:2, may be administered to inhibit or potentiate transcriptional
activity, as in the potentiation of interferon in cancer therapy. Also, the blockade
of the action of specific tyrosine phosphatases in the dephosphorylation of
activated (phosphorylated) recognition/transcription factor or protein presents a
method for potentiating the activity of the receptor recognition factor or protein
that would concomitantly potentiate therapies based on receptor recognition
factor/protein activation. Correspondingly, the inhibition or blockade of the
activation or binding of the recognition/transcription factor would affect MHC
Class II expression and consequently, would promote immunosuppression.
Materials exhibiting this activity, as illustrated later on herein by staurosporine,
may be useful in instances such as the treatment of autoimmune diseases and graft
rejection, where a degree of immunosuppression is desirable.

In particular, the proteins of ISGF-3 whose sequences are presented in SEQ ID 
NOS:2, 4, 6, 8, 10 or 12 herein, their antibodies, agonists, antagonists, or active 
fragments thereof, could be prepared in pharmaceutical formulations for 
administration in instances wherein interferon therapy is appropriate, such as to 
treat chronic viral hepatitis, hairy cell leukemia, and for use of interferon in
adjuvant therapy. The specificity of the receptor proteins hereof would make it 
possible to better manage the aftereffects of current interferon therapy, and would 
thereby make it possible to apply interferon as a general antiviral agent. 

Accordingly, it is a principal object of the present invention to provide a receptor
recognition factor and its subunits in purified form that exhibits certain
characteristics and activities associated with transcriptional promotion of cellular
activity.

It is a further object of the present invention to provide antibodies to the receptor
recognition factor and its subunits, and methods for their preparation, including
recombinant means.

It is a further object of the present invention to provide a method for detecting the
presence of the receptor recognition factor and its subunits in mammals in which
invasive, spontaneous, or idiopathic pathological states are suspected to be present.

It is a further object of the present invention to provide a method and associated
assay system for screening substances such as drugs, agents and the like,
potentially effective in either mimicking the activity or combating the adverse
effects of the recognition factor and/or its subunits in mammals.

It is a still further object of the present invention to provide a method for the
treatment of mammals to control the amount or activity of the recognition factor or
subunits thereof, so as to alter the adverse consequences of such presence or
activity, or where beneficial, to enhance such activity.

It is a still further object of the present invention to provide a method for the
treatment of mammals to control the amount or activity of the recognition factor or
its subunits, so as to treat or avert the adverse consequences of invasive,
spontaneous or idiopathic pathological states.

It is a still further object of the present invention to provide pharmaceutical
compositions for use in therapeutic methods which comprise or are based upon the
recognition factor, its subunits, their binding partner(s), or upon agents or drugs
that control the production, or that mimic or antagonize the activities of the
recognition factors.

Other objects and advantages will become apparent to those skilled in the art from
a review of the ensuing description which proceeds with reference to the following
illustrative drawings.

BRIEF DESCRIPTION OF THE DRAWINGS 



FIGURE 1 depicts the full receptor recognition factor nucleic acid sequence and
the deduced amino acid sequence derived for the ISGF-3α gene defining the 113
kD protein. The nucleotides are numbered from 1 to 2553 (SEQ ID NO:1), and
the amino acids are numbered from 1 to 851 (SEQ ID NO:2).

FIGURE 2 depicts the full receptor recognition factor nucleic acid sequence and
the deduced amino acid sequence derived for the ISGF-3α gene defining the 91 kD
protein. The nucleotides are numbered from 1 to 3943 (SEQ ID NO:3), and the
amino acids are numbered from 1 to 750 (SEQ ID NO:4).

FIGURE 3 depicts the full receptor recognition factor nucleic acid sequence and
the deduced amino acid sequence derived for the ISGF-3α gene defining the 84 kD
protein. The nucleotides are numbered from 1 to 2166 (SEQ ID NO:5), and the
amino acids are numbered from 1 to 712 (SEQ ID NO:6).
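
As an illustrative cross-check only, the sequence lengths recited for FIGURES 1 through 3 can be compared against the rule of three nucleotides per codon. The sketch below assumes that the recited nucleotide numbering may include non-coding sequence and that a stop codon is not counted among the amino acids; the function and variable names are hypothetical and form no part of the disclosure.

```python
# Illustrative arithmetic on the lengths recited for FIGURES 1-3.
# Names are hypothetical; only the numeric values come from the text.
FIGURE_LENGTHS = {
    "FIGURE 1 (113 kD)": (2553, 851),  # (nucleotides, amino acids)
    "FIGURE 2 (91 kD)": (3943, 750),
    "FIGURE 3 (84 kD)": (2166, 712),
}


def coding_nucleotides(amino_acids: int) -> int:
    """Nucleotides needed to encode the residues (3 per codon, stop codon excluded)."""
    return 3 * amino_acids


def noncoding_remainder(nucleotides: int, amino_acids: int) -> int:
    """Recited nucleotides left over after accounting for the coding codons."""
    remainder = nucleotides - coding_nucleotides(amino_acids)
    if remainder < 0:
        raise ValueError("fewer nucleotides than the coding region requires")
    return remainder


if __name__ == "__main__":
    for name, (nt, aa) in FIGURE_LENGTHS.items():
        print(name, coding_nucleotides(aa), noncoding_remainder(nt, aa))
```

On these values, the 2553 nucleotides recited for FIGURE 1 correspond exactly to 851 codons, whereas FIGURES 2 and 3 recite more nucleotides than the coding region alone requires.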

FIGURE 4 shows the purification of ISGF-3. The left-hand portion of the Figure 
shows the purification of ISGF-3 demonstrating the polypeptides present after the 
first oligonucleotide affinity column (lane 3) and two different preparations after
the final chromatography step (Lanes 1 and 2). The leftmost lane contains
protein size markers (High molecular weight, Sigma). ISGF-3 component proteins 
are indicated as 113 kD, 91 kD, 84 kD, and 48 kD [Kessler et al., GENES & 
DEV., 4 (1990); Levy et al., THE EMBO. J., 9 (1990)]. The right-hand portion 
of the Figure shows purified ISGF-3 from 2-3 x 10" cells was electroblotted to 
nitrocellulose after preparations 1 and 2 (Lanes 1 and 2) had been pooled and 
separated on a 7.5% SDS polyacrylamide gel. ISGF-3 component proteins are
indicated. The two lanes on the right represent protein markers (High molecular 
weight, and prestained markers, Sigma). 

FIGURE 5 generally presents the results of Northern Blot analysis for the 91/84 
kD peptides. Figure 5a presents restriction maps for cDNA clones E4 (top map) 
and E3 (bottom map) showing DNA fragments that were radiolabeled as probes 
(probes A-D). Figure 5b comprises Northern blots of cytoplasmic HeLa RNA 
hybridized with the indicated probes. The 4.4 and 3.1 kb species as well as the
28S and 18S rRNA bands are indicated. 

FIGURE 6 depicts the conjoint protein sequence of the 91 kD (SEQ ID NO:4) and
84 kD (SEQ ID NO:6) proteins of ISGF-3. One-letter amino acid code is shown
for the open reading frame from clone E4 (encoding the 91 kD protein). The 84
kD protein, encoded by a different cDNA (E3), has the identical sequence but
terminates after amino acid 712, as indicated. Tryptic peptides t19, t13a, and t13b
from the 91 kD protein are indicated. The sole recovered tryptic peptide from the
84 kD protein, peptide t27, was wholly contained within peptide t19 as indicated.

FIGURE 7 presents the results of Western blot and antibody shift analyses.

a) Highly purified ISGF-3, fractionated on a 7.0% SDS polyacrylamide
gel, was probed with antibodies a42 (amino acids 597-703); a55 (amino acids
2-59); and a57 (amino acids 705-739) in a Western blot analysis. The silver
stained part of the gel (lanes a, b, and c) illustrates the location of the ISGF-3
component proteins and the purity of the material used in Western blot: Lane a)
Silver stain of protein sample used in all the Western blot experiments (immune
and preimmune). Lane b) Material of equal purity to that shown in Fig. 4, for
clearer identification of the ISGF-3 proteins. Lane c) Size protein markers
indicated.

b) Antibody interference of the ISGF-3 shift complex: Lane a) The
complete ISGF-3 and the free ISGF-3γ component shift with partially purified
ISGF-3 are marked. Lane b) Competition with a 100 fold excess of cold ISRE
oligonucleotide. Lane c) Shift complex after the addition of 1 μl of preimmune
serum to a 12.5 μl shift reaction. Lanes d and e) Shift complex after the addition
of 1 μl of a 1:10 dilution or 1 μl of undiluted a42 antiserum to a 12.5 μl shift
reaction.

Methods:

Antibodies a42, a55 and a57 were prepared by injecting approximately 500 μg
of a fusion protein prepared in E. coli using the pGEX-3X vector [Smith et al.,
GENE, 67 (1988)]. Rabbits were bled after the second boost and serum prepared.

For Western blots highly purified ISGF-3 was separated on a 7% SDS
polyacrylamide gel and electroblotted to nitrocellulose. The filter was incubated in
blocking buffer ("blotto"), cut into strips and probed with specific antiserum and
preimmune antiserum diluted 1:500. The immune complexes were visualized with
the aid of an ECL kit (Amersham). Shift analyses were performed as previously
described [Levy et al., GENES & DEV., 2 (1988); Levy et al., GENES & DEV.,
3 (1989)] in a 4.5% polyacrylamide gel.

FIGURE 8 presents the full length amino acid sequence of 113 kD protein
components of ISGF-3α (SEQ ID NO:2) and alignment of conserved amino acid
sequences between the 113 kD and 91/84 kD proteins (SEQ ID NOS:4 and 6).

A. Polypeptide sequences (A-E) derived from protein micro-sequencing of 
purified 113 kD protein (see accompanying paper) are underlined. Based on 
peptide E, we designed a degenerate oligonucleotide, 

AAT/CACIGAA/GCCIATGGAA/GA1T/CATT (SEQ ID NO: 13), which was 
used to screen a cDNA library [Pine et al., MOL. CELL. BIOL., 10 (1990)]
basically as described [Norman et al., CELL, 55 (1988)]. Briefly, the degenerate 
oligonucleotides were labeled with 32P-γ-ATP by polynucleotide kinase,
hybridizations were carried out overnight at 40°C in 6 x SSTE (0.9 M NaCl, 60
mM Tris-HCl [pH 7.9], 6 mM EDTA), 0.1% SDS, 2 mM Na4P2O7, 6 mM KH2PO4
in the presence of 100 μg/ml salmon sperm DNA and 10 x Denhardt's
solution [Maniatis et al., MOLECULAR CLONING; A LABORATORY MANUAL 
(Cold Spring Harbor Lab., 1982)]. The nitrocellulose filters then were washed 4 
X 10 min. with the same hybridization conditions without labeled probe and 
salmon sperm DNA. Autoradiography was carried out at -80 °C with intensifying 
screen for 48 hrs. A PGR product was obtained later by the same method 
described for the 91/84 kD sequences, by using oligonucleotides designed 
according polypeptide D and E. The sequence of this PGR product was identical 
to a region in clone fU. The fuU length of 113 kD protein contains 851 amino 
acids. Three major helices in the N-terminal region were predicted by the 
methods of both Chou and Fasman [Chou et al., ANN. REV. BIOCHEM., 47 
(1978)] and Gamier et al [Gamier et al., /. MOL. BIOL., 12 (1978)] and are 
shown in shadowed boxes. At the C-terminal end, a highly negative charged 
domain was found. All negative charged residues are blackened and positive 
charged residues shadowed. The five polypeptides that derived from protein 
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microscreening [Aebersold et al,, PROC. NATL. ACAD. SCI USA, 87 (1987)] are 
underlined. 
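
The slash notation used for a degenerate oligonucleotide (two alternative bases
at one position, with I standing for inosine) can be illustrated with a short
sketch. The Python below is illustrative only: the function names and the short
example string are assumptions introduced here, the example string is not
SEQ ID NO:13, and the input is assumed to be well formed (no trailing slash).

```python
from itertools import product


def parse_degenerate(oligo):
    """Split slash notation into per-position choices, e.g. 'AAT/C' -> A, A, T-or-C."""
    positions = []
    i = 0
    while i < len(oligo):
        choices = [oligo[i]]
        i += 1
        # A '/' means the following base is an alternative at the same position.
        while i < len(oligo) and oligo[i] == "/":
            choices.append(oligo[i + 1])
            i += 2
        positions.append(choices)
    return positions


def degeneracy(oligo):
    """Number of distinct sequences in the degenerate pool (inosine counts once)."""
    total = 1
    for choices in parse_degenerate(oligo):
        total *= len(choices)
    return total


def expand(oligo):
    """List every concrete sequence represented by the degenerate oligonucleotide."""
    return ["".join(combo) for combo in product(*parse_degenerate(oligo))]


# Hypothetical six-position probe with two two-fold degenerate positions.
example = "AAT/CGAA/G"
pool = expand(example)
```

Each two-fold position doubles the pool, so a probe with four such positions
would represent sixteen distinct sequences.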

B) Comparison of the amino acid sequences of the 113 kD and 91/84 kD proteins
shows 42% identical amino acid residues in the overlapping 715 amino acid
sequence shown. In the middle helix region, four leucine and one valine heptad
repeats were identified in both the 113 and 91/84 kD proteins (the last leucine in
91/84 kD is not exactly preserved as a heptad repeat). When a heligram structure
was drawn, this helix is amphipathic (not shown). Another notable feature of this
comparison is several tyrosine residues that are conserved in both proteins near
their ends.
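
A heptad repeat, as referred to above, places a hydrophobic residue (here leucine
or valine) at every seventh position. A minimal sketch of how such a run might be
located in a one-letter sequence follows. The function name, the default
threshold of four repeats and the toy sequence are assumptions for illustration;
the toy sequence is not taken from any SEQ ID NO herein.

```python
def heptad_runs(seq, residues="LV", min_repeats=4):
    """Return (start_index, count) for runs of `residues` spaced exactly 7 apart."""
    runs = []
    for start in range(len(seq)):
        if seq[start] not in residues:
            continue
        # Skip positions that merely continue a run begun 7 residues earlier.
        if start >= 7 and seq[start - 7] in residues:
            continue
        count = 1
        pos = start + 7
        while pos < len(seq) and seq[pos] in residues:
            count += 1
            pos += 7
        if count >= min_repeats:
            runs.append((start, count))
    return runs


# Toy sequence: four leucines and one valine, each seven residues apart.
toy = "LAAAAAA" * 4 + "V"
found = heptad_runs(toy)
```

Indices are zero-based, so a run reported at index 0 begins at the first residue
of the sequence supplied.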

FIGURE 9 shows the in vitro transcription and translation of 113 kD and 91 kD 
cDNA and a Northern blot analysis with 113 kD cDNA probe. 

a) The full-length cDNA clones of the 113 and 91 kD proteins were
transcribed in vitro and the transcribed RNAs were translated in vitro with rabbit
reticulocyte lysate (Promega; conditions as described in the Promega protocol).
The mRNA of BMV (Promega) was simultaneously translated as a protein size
marker. The 113 cDNA yielded a translated product of about 105 kD and the 91
cDNA yielded an 86 kD product.

b) When total cytoplasmic mRNAs isolated from superinduced HeLa cells
were utilized, a single 4.8 kb mRNA band was observed with a cDNA probe
coding for the C-terminal end of the 113 kD protein in a Northern blot analysis
[Nielsch et al., THE EMBO J., 10 (1991)].

FIGURE 10(A) presents the results of Western blot analysis confirming the
identity of the 113 kD protein. An antiserum raised against a polypeptide segment
[Harlow et al., ANTIBODIES: A LABORATORY MANUAL (Cold Spring Harbor
Lab., 1988)] from amino acid 500 to 650 of the 113 kD protein specifically
recognized a 113 kD protein in a Western blot analysis. The antiserum
recognized a band both in a highly purified ISGF-3 fraction (> 10,000 fold) from
DNA affinity chromatography and in the crude extracts prepared from γ and α
IFN-treated HeLa cells [Fu et al., PROC. NATL. ACAD. SCI. USA, 87 (1990)].
The antiserum was raised against a fusion protein [a cDNA fragment coding for
part of the 113 kD protein was inserted into pGEX-2T, a high expression vector in
E. coli (Smith et al., PROC. NATL. ACAD. SCI. USA, 83 (1986))] purified from
E. coli [Smith et al., GENE, 67 (1988)]. Female NZW rabbits were
immunized with 1 mg fusion protein in Freund's adjuvant. Two subsequent boosts
two weeks apart were carried out with 500 mg fusion protein. The Western blot
was carried out under conditions described previously [Pine et al., MOL. CELL.
BIOL., 10 (1990)].

FIGURE 10(B) presents the results of a mobility shift assay showing that the
anti-113 antiserum affects the ISGF-3 shift complex. Preimmune serum or the 113
kD antiserum was added to a shift reaction carried out as described [Fu et al.,
PROC. NATL. ACAD. SCI. USA, 87 (1990); Kessler et al., GENES & DEV., 4
(1990)] at room temperature for 20 min.; then one-third of the reaction material was
loaded onto a 5% polyacrylamide gel. In addition, unlabeled probe was included in
one reaction to show specificity of the gel shift complexes.

FIGURE 11 shows the results of experiments investigating the IFN-α dependent
phosphorylation of 113, 91 and 84 kD proteins. Protein samples from cells
treated in various ways after 60 min. exposure to ³²PO₄³⁻ were precipitated with
antiserum to 113 kD protein. Lane 1, no treatment of cells; Lane 2, cells treated
7 min. with IFN-α. By comparison with the marker proteins labeled 200, 97.5,
69 and 46 kD (kilodaltons), the PO₄³⁻-labeled proteins in the precipitate are seen
to be 113 and 91 kD. Lane 3, cells treated with IFN-γ overnight (no
phosphorylated proteins) and then (Lane 4) treated with IFN-α for 7 min. show
heavier phosphorylation of 113, 91 and 84 kD.

FIGURE 12 is a chromatogram depicting the identification of phosphoamino acid.
Phosphate-labeled protein of 113, 91 or 84 kD size was hydrolyzed and
chromatographed to reveal newly labeled phosphotyrosine. Cells untreated with
IFN showed only phosphoserine label. (P Ser = phosphoserine; P Thr =
phosphothreonine; P Tyr = phosphotyrosine.)

FIGURE 13 depicts (A) the deduced amino acid sequence (SEQ ID NO: 8) of and 
(B-D) the DNA sequence (SEQ ID NO:7) encoding the murine 91 kD intracellular
receptor recognition factor.

FIGURE 14 depicts (A) the deduced amino acid sequence (SEQ ID NO: 10) of and 
(B-C) the DNA sequence (SEQ ID NO:9) encoding the 13sf1 intracellular receptor
recognition factor. 

FIGURE 15 depicts (A) the deduced amino acid sequence (SEQ ID NO: 12) of and 
(B-C) the DNA sequence (SEQ ID NO: 11) encoding the 19sf6 intracellular 
receptor recognition factor. 

FIGURE 16. Determination of molecular weights of Stat91 and phospho Stat91 
by native gel analysis. 

A) Western blot analysis of fractions from affinity purification. Extracts from
human FS2 fibroblasts treated with IFN-γ (Ext), the unbound fraction (Flow), the
fraction washed with Buffer A0.2 (A0.2), and the bound fraction eluted with
Buffer A0.8 (A0.8) were immunoblotted with anti-91T.

B) Native gel analysis. Phosphorylated Stat91 (the A0.8 fraction from A) and
unphosphorylated Stat91 (the Flow fraction from A) were analyzed on 4.5%,
5.5%, 6.5% and 7.5% native polyacrylamide gels followed by immunoblotting
with anti-91T. The top of gels (TOP) and the migration position of bromophenol
blue (BPB) are indicated.

C) Ferguson plots. The relative mobilities (Rm) of Stat91 and phospho Stat91
were obtained from Figure 1B (see Experimental Procedures). Closed circle:
Chicken egg albumin (45 kD); Cross: Bovine serum albumin, monomer (66 kD);
Open square: Bovine serum albumin, dimer (132 kD); Open circle: Urease, trimer
(272 kD); Open triangle: Unphosphorylated Stat91; Closed triangle:
Phosphorylated Stat91.

D) Determination of molecular weights from the standard curve. The molecular
weights of phosphorylated and unphosphorylated Stat91 proteins (indicated as
closed and open arrows, respectively) were obtained by extrapolation of their
retardation coefficients.
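
The Ferguson analysis summarized in panels C and D can be sketched numerically.
The sketch below assumes the common convention that log10(Rm) falls linearly
with gel concentration, the negative slope being the retardation coefficient,
and that log10 of the retardation coefficient is linear in log10 of molecular
weight for the standards. The function names and all numerical values are
hypothetical and synthetic; they are not data from FIGURE 16.

```python
import math


def linear_fit(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def retardation_coefficient(gel_percents, rel_mobilities):
    """Negative slope of log10(Rm) versus gel concentration (the Ferguson plot)."""
    slope, _ = linear_fit(gel_percents, [math.log10(r) for r in rel_mobilities])
    return -slope


def estimate_mw(standards, unknown_kr):
    """Read a molecular weight (kD) off a log-log standard curve.

    `standards` is a list of (molecular_weight_kd, retardation_coefficient).
    """
    xs = [math.log10(kr) for _, kr in standards]
    ys = [math.log10(mw) for mw, _ in standards]
    slope, intercept = linear_fit(xs, ys)
    return 10 ** (slope * math.log10(unknown_kr) + intercept)


# Synthetic mobilities on the four gel concentrations named in panel B.
gels = [4.5, 5.5, 6.5, 7.5]
mobilities = [10 ** (-0.05 * t) for t in gels]
kr = retardation_coefficient(gels, mobilities)
```

A protein whose retardation coefficient falls outside the range spanned by the
standards is estimated by extrapolation rather than interpolation, and the
estimate is correspondingly less certain.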

FIGURE 17. Determination of molecular weights by glycerol gradients. 

A) Western blot analysis. Extracts from human Bud8 fibroblasts treated with IFN-γ
(the rightmost lane) and every other fraction from fraction 16 to 34 were
analyzed on 7.5% SDS-PAGE followed by immunoblotting with anti-91T. The
peak of phosphorylated Stat91 (fraction 20) and the peak of unphosphorylated
Stat91 (fraction 30) are indicated by a closed and open arrow, respectively.

B) Mobility shift analysis. Every other fraction from the gradients was
analyzed.

C) Graphic representation of the data from A and B. Peak fraction numbers of
protein standards are plotted versus their molecular weight. The positions of the
peaks of phosphorylated and unphosphorylated Stat91 protein are indicated by the
closed and open arrows, respectively. Standards are ferritin (Fer, 440 kD), catalase
(Cat, 232 kD), ferritin half unit (Fer 1/2, 220 kD), aldolase (Ald, 158 kD), bovine
serum albumin (BSA, 68 kD).

FIGURE 18. Stat91 in cell extracts binds DNA as a dimer. 

A) Western blot analysis. Extracts from stable cell lines expressing either Stat84
(C84), or Stat91L (C91L) or both (Cmx) were analyzed on 7.5% SDS-PAGE
followed by immunoblotting with anti-91.

B) Gel mobility shift analysis. Extracts from stable cell lines (Fig 3A) untreated
(-) or treated with IFN-γ (+) were analyzed. The positions of Stat91 homodimer
(91L), Stat84 homodimer (84), and the heterodimer (84*91) are indicated.

FIGURE 19. Formation of heterodimer by denaturation and renaturation.
Cytoplasmic (Left Panel) or nuclear extracts (Right Panel) from IFN-γ-treated cell
lines expressing either Stat84 (C84) or Stat91 (C91) were analyzed by gel mobility
shift assays. +: with addition; -: without addition; D/R: samples were subjected
to guanidinium hydrochloride denaturation and renaturation treatment.

FIGURE 20. Diagrammatic representation of dissociation and reassociation
analysis.

FIGURE 21. Dissociation-reassociation analysis with peptides. Gel mobility shift
analysis with IFN-γ treated nuclear extracts from cell lines expressing Stat91L
(C91L, lane 15) or Stat84 (C84, lane 14) or mixture of both (lane 1-13, 16-18) in
the presence of increasing concentrations of various peptides. 91-Y,
unphosphorylated peptide from Stat91 (LDGPKGTGYIKTELI) (SEQ ID
NO:18); 91Y-p, phosphotyrosyl peptide from Stat91 (GY*IKTE) (SEQ ID
NO:19); 113Y-p, phosphotyrosyl peptide with high binding affinity to Src SH2
domain (EPQY*EEIPIYL, Songyang et al., 1993, Cell 72:767-778) (SEQ ID
NO:21). Final concentrations of peptides added: 1 µM (lane 8), 4 µM (lane 2, 5,
11), 10 µM (lane 9), 40 µM (lane 3, 6, 10, 12, 14-18), 160 µM (lane 4, 7, 13).
+: with addition; -: without addition. Right panel: antiserum tests for identity
of gel-shift bands (see Figure 3).

FIGURE 22. Dissociation-reassociation analysis with GST fusion proteins. A)
SDS-PAGE (12%) analysis of purified GST fusion proteins as visualized by
Coomassie blue. GST-91 SH2, native SH2 domain of Stat91; GST-91 mSH2, R^
to mutant; GST-91 SH3, SH3 domain of Stat91; GST Src SH2, the SH2
domain of src protein. Same amounts (1 µg) of each fusion protein were loaded.
Protein markers were run in lane 1 as indicated.

B) Dissociation-reassociation analysis similar to Figure 6. Dissociating agents
were GST fusion proteins purified from bacterial expression as shown above.
Final concentrations of fusion proteins added are 0.5 µM (lanes 2, 5, 8, 11, 14),
2.5 µM (lanes 3, 6, 9, 12, 15) and 5 µM (lanes 4, 7, 10, 13, 17, 18). +: with
addition; -: without addition; FP: fusion proteins.

FIGURE 23. Comparison of Stat91 SH2 structure with known SH2 structures.
The Stat91 sequence is disclosed herein (SEQ ID NO: 4). The structures used for
the other SH2s are Src (Waksman et al., 1992, Nature 358:646-653) (SEQ ID
NO:22), Abl (Overduin et al., 1992, Proc. Natl. Acad. Sci. USA 89:11673-77 and
1992, Cell 70:697-704) (SEQ ID NO:23), Lck (Eck et al., 1993, Nature
362:87-91) (SEQ ID NO:24), and p85αN (Booker et al., 1992, Nature 358:684-687)
(SEQ ID NO:25). The alignment of the determined structures is by direct
coordinate superimposition of the backbone structures. The names of secondary
structural features and significant residues are based on the scheme of Eck et al.,
1993. The boundaries and extents of the structure features are indicated by [ -- ].
The starting numbers for the parent sequences are shown in parentheses.
Experimentally determined structurally conserved regions are from Src, p85α, and
Abl (Cowburn, unpublished). The root mean square deviation of
three-dimensionally aligned structures differs by less than 1 Angstrom for the
backbone non-hydrogen atoms in the sections marked by the XXX.

DETAILED DESCRIPTION

In accordance with the present invention there may be employed conventional
molecular biology, microbiology, and recombinant DNA techniques within the
skill of the art. Such techniques are explained fully in the literature. See, e.g.,
Maniatis, Fritsch & Sambrook, "Molecular Cloning: A Laboratory Manual"
(1982); "DNA Cloning: A Practical Approach," Volumes I and II (D.N. Glover
ed. 1985); "Oligonucleotide Synthesis" (M.J. Gait ed. 1984); "Nucleic Acid
Hybridization" [B.D. Hames & S.J. Higgins eds. (1985)]; "Transcription And
Translation" [B.D. Hames & S.J. Higgins, eds. (1984)]; "Animal Cell Culture"
[R.I. Freshney, ed. (1986)]; "Immobilized Cells And Enzymes" [IRL Press,
(1986)]; B. Perbal, "A Practical Guide To Molecular Cloning" (1984).

Therefore, if appearing herein, the following terms shall have the definitions set
out below.

The terms "receptor recognition factor", "receptor recognition-tyrosine kinase 
5 factor", "receptor recognition factor/tyrosine kinase substrate", "receptor 
recognition/transcription factor", "recognition factor" and "recognition factor 
protein(s)" and any variants not specifically listed, may be used herein 
interchangeably, and as used throughout the present application and claims refer to 
proteinaceous material including single or multiple proteins, and extends to those
proteins having the amino acid sequence data described herein and presented in
FIGURE 1 (SEQ ID NO:2), FIGURE 2 (SEQ ID NO:4) and in FIGURE 3 (SEQ
ID NO: 6), and the profile of activities set forth herein and in the Claims.
Accordingly, proteins displaying substantially equivalent or altered activity are
likewise contemplated. These modifications may be deliberate, for example, such
as modifications obtained through site-directed mutagenesis, or may be accidental,
such as those obtained through mutations in hosts that are producers of the
complex or its named subunits. Also, the terms "receptor recognition factor",
"recognition factor" and "recognition factor protein(s)" are intended to include
within their scope proteins specifically recited herein as well as all substantially
homologous analogs and allelic variations.

The amino acid residues described herein are preferred to be in the "L" isomeric 
form. However, residues in the "D" isomeric form can be substituted for any L-
amino acid residue, as long as the desired functional property of immunoglobulin-
binding is retained by the polypeptide. NH2 refers to the free amino group
present at the amino terminus of a polypeptide. COOH refers to the free carboxy
group present at the carboxy terminus of a polypeptide. In keeping with standard 
polypeptide nomenclature, J. Biol. Chem., 243:3552-59 (1969), abbreviations for 
amino acid residues are shown in the following Table of Correspondence: 

TABLE OF CORRESPONDENCE

SYMBOL                      AMINO ACID
1-Letter      3-Letter

Y             Tyr           tyrosine
G             Gly           glycine
F             Phe           phenylalanine
M             Met           methionine
A             Ala           alanine
S             Ser           serine
I             Ile           isoleucine
L             Leu           leucine
T             Thr           threonine
V             Val           valine
P             Pro           proline
K             Lys           lysine
H             His           histidine
Q             Gln           glutamine
E             Glu           glutamic acid
W             Trp           tryptophan
R             Arg           arginine
D             Asp           aspartic acid
N             Asn           asparagine
C             Cys           cysteine
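
By way of illustration only, the Table of Correspondence can be expressed as a
lookup for converting between the one-letter and three-letter notations. The
names below (ONE_TO_THREE, to_three_letter, to_one_letter) and the use of a
hyphen as separator are assumptions made for this sketch; the example peptide
GYIKTE is the Stat91 peptide recited in the legend to FIGURE 21, shown here
without the phosphotyrosine mark.

```python
# One-letter to three-letter symbols, in the order given in the Table.
ONE_TO_THREE = {
    "Y": "Tyr", "G": "Gly", "F": "Phe", "M": "Met", "A": "Ala",
    "S": "Ser", "I": "Ile", "L": "Leu", "T": "Thr", "V": "Val",
    "P": "Pro", "K": "Lys", "H": "His", "Q": "Gln", "E": "Glu",
    "W": "Trp", "R": "Arg", "D": "Asp", "N": "Asn", "C": "Cys",
}

# Reverse lookup built from the same table.
THREE_TO_ONE = {three: one for one, three in ONE_TO_THREE.items()}


def to_three_letter(seq):
    """Convert a one-letter sequence, written amino- to carboxy-terminus."""
    return "-".join(ONE_TO_THREE[residue] for residue in seq.upper())


def to_one_letter(seq3):
    """Convert a hyphen-separated three-letter sequence back to one-letter form."""
    return "".join(THREE_TO_ONE[token] for token in seq3.split("-"))


peptide = to_three_letter("GYIKTE")
```

A symbol that is absent from the Table raises a KeyError, so an unrecognized
residue is reported rather than silently dropped.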



It should be noted that all amino-acid residue sequences are represented herein by
formulae whose left and right orientation is in the conventional direction of amino-
terminus to carboxy-terminus. Furthermore, it should be noted that a dash at the
beginning or end of an amino acid residue sequence indicates a peptide bond to a
further sequence of one or more amino-acid residues. The above Table is
presented to correlate the three-letter and one-letter notations which may appear
alternately herein.

A "replicon" is any genetic element (e.g., plasmid, chromosome, virus) that
functions as an autonomous unit of DNA replication in vivo; i.e., capable of
replication under its own control.

A "vector" is a replicon, such as plasmid, phage or cosmid, to which another
DNA segment may be attached so as to bring about the replication of the attached
segment.

A "DNA molecule" refers to the polymeric form of deoxyribonucleotides (adenine, 
guanine, thymine, or cytosine) in its either single stranded form, or a double- 
stranded helix. This term refers only to the primary and secondary structure of 
the molecule, and does not limit it to any particular tertiary forms. Thus, this 
term includes double-stranded DNA found, inter alia, in linear DNA molecules 
(e.g., restriction fragments), viruses, plasmids, and chromosomes. In discussing 
the structure of particular double-stranded DNA molecules, sequences may be 
described herein according to the normal convention of giving only the sequence in 
the 5' to 3' direction along the nontranscribed strand of DNA (i.e., the strand 
having a sequence homologous to the mRNA). 

An "origin of replication" refers to those DNA sequences that participate in DNA 
synthesis. 

A DNA "coding sequence" is a double-stranded DNA sequence which is 
transcribed and translated into a polypeptide in vivo when placed under the control 
of appropriate regulatory sequences. The boundaries of the coding sequence are 
determined by a start codon at the 5' (amino) terminus and a translation stop
codon at the 3' (carboxyl) terminus. A coding sequence can include, but is not
limited to, prokaryotic sequences, cDNA from eukaryotic mRNA, genomic DNA
sequences from eukaryotic (e.g., mammalian) DNA, and even synthetic DNA
sequences. A polyadenylation signal and transcription termination sequence will
usually be located 3' to the coding sequence.
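
The boundaries of a coding sequence as just defined (a start codon at the 5' end
and an in-frame translation stop codon at the 3' end) can be illustrated with a
minimal sketch. It assumes the standard start codon ATG and the standard stop
codons TAA, TAG and TGA, and reads only the strand supplied; the function name
and the example sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_coding_sequence(dna):
    """Return the first ATG-to-stop open reading frame on the given strand, or None."""
    dna = dna.upper()
    start = dna.find("ATG")
    while start != -1:
        # Walk codon by codon, in frame with this ATG, looking for a stop codon.
        for i in range(start + 3, len(dna) - 2, 3):
            if dna[i:i + 3] in STOP_CODONS:
                return dna[start:i + 3]
        # No in-frame stop codon: try the next ATG downstream.
        start = dna.find("ATG", start + 1)
    return None


# Hypothetical fragment: two flanking bases, four codons, two trailing bases.
orf = find_coding_sequence("CCATGGCTTACTGAGG")
```

The sketch ignores introns and regulatory sequences, so it applies to a cDNA or
prokaryotic sequence rather than to eukaryotic genomic DNA.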



Transcriptional and translational control sequences are DNA regulatory sequences,
such as promoters, enhancers, polyadenylation signals, terminators, and the like,
that provide for the expression of a coding sequence in a host cell.

A "promoter sequence" is a DNA regulatory region capable of binding RNA
polymerase in a cell and initiating transcription of a downstream (3' direction)
coding sequence. For purposes of defining the present invention, the promoter
sequence is bounded at its 3' terminus by the transcription initiation site and
extends upstream (5' direction) to include the minimum number of bases or
elements necessary to initiate transcription at levels detectable above background.
Within the promoter sequence will be found a transcription initiation site
(conveniently defined by mapping with nuclease S1), as well as protein binding
domains (consensus sequences) responsible for the binding of RNA polymerase.
Eukaryotic promoters will often, but not always, contain "TATA" boxes and
"CAT" boxes. Prokaryotic promoters contain Shine-Dalgarno sequences in
addition to the -10 and -35 consensus sequences.

An "expression control sequence" is a DNA sequence that controls and regulates 
the transcription and translation of another DNA sequence. A coding sequence is
"under the control" of transcriptional and translational control sequences in a cell
when RNA polymerase transcribes the coding sequence into mRNA, which is then 
translated into the protein encoded by the coding sequence. 

A "signal sequence" can be included before the coding sequence. This sequence 
encodes a signal peptide, N-terminal to the polypeptide, that communicates to the
host cell to direct the polypeptide to the cell surface or secrete the polypeptide into
the media, and this signal peptide is clipped off by the host cell before the protein
leaves the cell. Signal sequences can be found associated with a variety of
proteins native to prokaryotes and eukaryotes.

The term "oligonucleotide", as used herein in referring to the probe of the present
invention, is defined as a molecule comprised of two or more ribonucleotides,
preferably more than three. Its exact size will depend upon many factors which,
in turn, depend upon the ultimate function and use of the oligonucleotide.

The term "primer" as used herein refers to an oligonucleotide, whether occurring 
naturally as in a purified restriction digest or produced synthetically, which is 
capable of acting as a point of initiation of synthesis when placed under conditions 
in which synthesis of a primer extension product, which is complementary to a
nucleic acid strand, is induced, i.e., in the presence of nucleotides and an inducing
agent such as a DNA polymerase and at a suitable temperature and pH. The
primer may be either single-stranded or double-stranded and must be sufficiently
long to prime the synthesis of the desired extension product in the presence of the
inducing agent. The exact length of the primer will depend upon many factors,
including temperature, source of primer and use of the method. For example, for
diagnostic applications, depending on the complexity of the target sequence, the
oligonucleotide primer typically contains 15-25 or more nucleotides, although it
may contain fewer nucleotides.

The primers herein are selected to be "substantially" complementary to different
strands of a particular target DNA sequence. This means that the primers must be
sufficiently complementary to hybridize with their respective strands. Therefore,
the primer sequence need not reflect the exact sequence of the template. For
example, a non-complementary nucleotide fragment may be attached to the 5' end
of the primer, with the remainder of the primer sequence being complementary to
the strand. Alternatively, non-complementary bases or longer sequences can be
interspersed into the primer, provided that the primer sequence has sufficient
complementarity with the sequence of the strand to hybridize therewith and
thereby form the template for the synthesis of the extension product.
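
The notion of a primer that carries a non-complementary 5' fragment yet remains
sufficiently complementary at its 3' end can be sketched as follows. This is a
deliberately crude model, not a hybridization predictor: it assumes that exact
complementarity over the last 15 bases of the primer is the criterion, which is
an illustrative choice made here. The function names, the template and the
6-base tail are all hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def anneals(primer, template, min_3prime_match=15):
    """True if the 3'-terminal bases of `primer` are exactly complementary
    to some stretch of `template` (both written 5' to 3')."""
    core = primer.upper()[-min_3prime_match:]
    if len(core) < min_3prime_match:
        return False
    return reverse_complement(core) in template.upper()


# Hypothetical template and a primer complementary to positions 5-24 of it.
template = "GGATCCATGGCTAGCTAGGATCGATCGTACGATCG"
core_primer = reverse_complement(template[5:25])

# Attaching a non-complementary 6-base fragment to the 5' end leaves the
# 3' end of the primer untouched.
tailed_primer = "GAATTC" + core_primer
```

Under this model both core_primer and tailed_primer anneal to the template,
since only the 3' end of the primer is examined.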



As used herein, the terms "restriction endonucleases" and "restriction enzymes"
refer to bacterial enzymes, each of which cut double-stranded DNA at or near a
specific nucleotide sequence.

A cell has been "transformed" by exogenous or heterologous DNA when such 
DNA has been introduced inside the cell. The transforming DNA may or may not 
be integrated (covalently linked) into chromosomal DNA making up the genome of 
the cell. In prokaryotes, yeast, and mammalian cells for example, the 
transforming DNA may be maintained on an episomal element such as a plasmid. 
With respect to eukaryotic cells, a stably transformed cell is one in which the 
transforming DNA has become integrated into a chromosome so that it is inherited 
by daughter cells through chromosome replication. This stability is demonstrated 
by the ability of the eukaryotic cell to establish cell lines or clones comprised of a 
population of daughter cells containing the transforming DNA. A "clone" is a
population of cells derived from a single cell or common ancestor by mitosis. A 
"cell line" is a clone of a primary cell that is capable of stable growth in vitro for 
many generations. 

Two DNA sequences are "substantially homologous" when at least about 75% 
(preferably at least about 80%, and most preferably at least about 90 or 95%) of 
the nucleotides match over the defined length of the DNA sequences. Sequences 
that are substantially homologous can be identified by comparing the sequences 
using standard software available in sequence data banks, or in a Southern 
hybridization experiment under, for example, stringent conditions as defined for 
that particular system. Defining appropriate hybridization conditions is within the 
skill of the art. See, e.g., Maniatis et al., supra; DNA Cloning, Vols. I & II,
supra; Nucleic Acid Hybridization, supra.
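
The percentage thresholds recited above can be illustrated with a minimal
calculation of nucleotide identity over a defined length. The sketch assumes the
two sequences have already been aligned to equal length without gaps, which is a
simplification; standard alignment software handles the general case. The
function names and the example sequences are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of matching positions between two pre-aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal (aligned) length")
    matches = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a == b)
    return 100.0 * matches / len(seq_a)


def is_substantially_homologous(seq_a, seq_b, threshold=75.0):
    """Apply a percent-identity threshold; 75, 80, 90 and 95 are the values recited above."""
    return percent_identity(seq_a, seq_b) >= threshold


# Two hypothetical 10-nucleotide sequences differing at the last two positions.
identity = percent_identity("ACGTACGTAC", "ACGTACGTTT")
```

Sequences that meet the 75% threshold may still fall short of the more preferred
90% or 95% levels, so the threshold is passed explicitly.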

A "heterologous" region of the DNA construct is an identifiable segment of DNA 
within a larger DNA molecule that is not found in association with the larger 
molecule in nature. Thus, when the heterologous region encodes a mammalian
gene, the gene will usually be flanked by DNA that does not flank the mammalian
genomic DNA in the genome of the source organism. Another example of a
heterologous coding sequence is a construct where the coding sequence itself is not
found in nature (e.g., a cDNA where the genomic coding sequence contains
introns, or synthetic sequences having codons different than the native gene).
Allelic variations or naturally-occurring mutational events do not give rise to a 
heterologous region of DNA as defined herein. 

An "antibody" is any immunoglobulin, including antibodies and fragments thereof, 
that binds a specific epitope. The term encompasses polyclonal, monoclonal, and
chimeric antibodies, the last mentioned described in further detail in U.S. Patent
Nos. 4,816,397 and 4,816,567.

An "antibody combining site" is that structural portion of an antibody molecule
comprised of heavy and light chain variable and hypervariable regions that
specifically binds antigen.

The phrase "antibody molecule" in its various grammatical forms as used herein
contemplates both an intact immunoglobulin molecule and an immunologically
active portion of an immunoglobulin molecule.

Exemplary antibody molecules are intact immunoglobulin molecules, substantially
intact immunoglobulin molecules and those portions of an immunoglobulin
molecule that contain the paratope, including those portions known in the art as
Fab, Fab', F(ab')2 and F(v), which portions are preferred for use in the
therapeutic methods described herein.

Fab and F(ab')2 portions of antibody molecules are prepared by the proteolytic
reaction of papain and pepsin, respectively, on substantially intact antibody
molecules by methods that are well-known. See for example, U.S. Patent No.
4,342,566 to Theofilopoulos et al. Fab' antibody molecule portions are also
well-known and are produced from F(ab')2 portions followed by reduction of the
disulfide bonds linking the two heavy chain portions as with mercaptoethanol, and
followed by alkylation of the resulting protein mercaptan with a reagent such as
iodoacetamide. An antibody containing intact antibody molecules is preferred
herein.

The phrase "monoclonal antibody" in its various grammatical forms refers to an 
antibody having only one species of antibody combining site capable of 
immunoreacting with a particular antigen. A monoclonal antibody thus typically
displays a single binding affinity for any antigen with which it immunoreacts. A
monoclonal antibody may therefore contain an antibody molecule having a 
plurality of antibody combining sites, each immunospecific for a different antigen; 
e.g., a bispecific (chimeric) monoclonal antibody. 

The phrase "pharmaceutically acceptable" refers to molecular entities and
compositions that are physiologically tolerable and do not typically produce an
allergic or similar untoward reaction, such as gastric upset, dizziness and the like,
when administered to a human.

The phrase "therapeutically effective amount" is used herein to mean an amount
sufficient to prevent, and preferably reduce by at least about 30 percent, more
preferably by at least 50 percent, most preferably by at least 90 percent, a
clinically significant change in the S phase activity of a target cellular mass, or
other feature of pathology such as for example, elevated blood pressure, fever or
white cell count as may attend its presence and activity.

A DNA sequence is "operatively linked" to an expression control sequence when
the expression control sequence controls and regulates the transcription and
translation of that DNA sequence. The term "operatively linked" includes having
an appropriate start signal (e.g., ATG) in front of the DNA sequence to be
expressed and maintaining the correct reading frame to permit expression of the
DNA sequence under the control of the expression control sequence and
production of the desired product encoded by the DNA sequence. If a gene that
one desires to insert into a recombinant DNA molecule does not contain an
appropriate start signal, such a start signal can be inserted in front of the gene.

The term "standard hybridization conditions" refers to salt and temperature 
conditions substantially equivalent to 5 x SSC and 65°C for both hybridization and
wash. 

In its primary aspect, the present invention concerns the identification of a
receptor recognition factor, and the isolation and sequencing of a particular
receptor recognition factor protein, that is believed to be present in cytoplasm and
that serves as a signal transducer between a particular cellular receptor having
bound thereto an equally specific polypeptide ligand, and the comparably specific
transcription factor that enters the nucleus of the cell and interacts with a specific
DNA binding site for the activation of the gene to promote the predetermined
response to the particular polypeptide stimulus. The present disclosure confirms
that specific and individual receptor recognition factors exist that correspond to
known stimuli such as tumor necrosis factor, nerve growth factor, platelet-derived
growth factor and the like. Specific evidence of this is set forth herein with
respect to the interferons α and γ (IFNα and IFNγ).

A further property of the receptor recognition factors (also termed herein signal
transducers and activators of transcription, or STAT) is dimerization to form
homodimers or heterodimers upon activation by phosphorylation of tyrosine. In a
specific embodiment, infra, Stat91 and Stat84 form homodimers and a
Stat91-Stat84 heterodimer. Accordingly, the present invention is directed to such
dimers, which can form spontaneously by phosphorylation of the STAT protein, or
which can be prepared synthetically by chemically cross-linking two like or unlike
STAT proteins.

The present receptor recognition factor is likewise noteworthy in that it appears
not to be demonstrably affected by fluctuations in second messenger activity and
concentration. The receptor recognition factor proteins appear to act as a substrate
for tyrosine kinase domains, but do not appear to interact with G-proteins,
and therefore do not appear to be second messengers.

A particular receptor recognition factor, identified herein by SEQ ID NO:4, has
been determined to be present in cytoplasm and serves as a signal transducer and a
specific transcription factor in response to IFN-γ stimulation that enters the
nucleus of the cell and interacts directly with a specific DNA binding site for the
activation of the gene to promote the predetermined response to the particular
polypeptide stimulus. This particular factor also acts as a translation protein and,
in particular, as a DNA binding protein in response to interferon-γ stimulation.
This factor is likewise noteworthy in that it has the following characteristics:

a) It interacts with an interferon-γ-bound receptor kinase complex;

b) It is a tyrosine kinase substrate; and

c) When phosphorylated, it serves as a DNA binding protein.

More particularly, the factor of SEQ ID NO:4 directly interacts with DNA after
acquiring phosphate on tyrosine located at position 701 of the amino acid
sequence. Also, interferon-γ-dependent activation of this factor occurs without
new protein synthesis and appears within minutes of interferon-γ treatment,
achieves maximum extent between 15 and 30 minutes thereafter, and then
disappears after 2-3 hours.

In a particular embodiment, the present invention relates to all members of the
herein disclosed family of receptor recognition factors except the 91 kD protein
factors, specifically the proteins whose sequences are represented by one or more
of SEQ ID NO:4, SEQ ID NO:6 or SEQ ID NO:8.

Subsequent to the filing of the initial applications directed to the present invention,
the inventors have termed each member of the family of receptor recognition
factors as a signal transducer and activator of transcription (STAT) protein. Each
STAT protein is designated by the apparent molecular weight (e.g., Stat113,
Stat91, Stat84, etc.), or by the order in which it has been identified (e.g., Stat1α
[Stat91], Stat1β [Stat84], Stat2 [Stat113], Stat3 [a murine protein described in
U.S. Application Serial No. 08/126,588, filed September 24, 1993 as 19sf6], and
Stat4 [a murine STAT protein described in U.S. Application Serial No.
08/126,588, filed September 24, 1993 as 13sf1]). As will be readily appreciated
by one of ordinary skill in the art, the choice of name has no effect on the
intrinsic characteristics of the factors described herein, which were first disclosed
in U.S. Application Serial No. 07/845,296, filed March 19, 1992. The present
inventors have chosen to adopt this newly derived terminology herein as a
convenience to the skilled artisan who is familiar with the subsequently published
papers relating to the same, and in accordance with the proposal to harmonize the
terminology for the novel class of proteins, and nucleic acids encoding the
proteins, disclosed by the instant inventors. The terms [molecular weight] kd
receptor recognition factor, Stat[molecular weight], and Stat[number] are used
herein interchangeably, and have the meanings given above. For example, the
terms 91 kd protein, Stat91, and Stat1α refer to the same protein, and in the
appropriate context refer to the nucleic acid molecule encoding such protein.

As stated above, the present invention also relates to a recombinant DNA molecule
or cloned gene, or a degenerate variant thereof, which encodes a receptor
recognition factor, or a fragment thereof, that possesses a molecular weight of
about 113 kD and an amino acid sequence set forth in FIGURE 1 (SEQ ID NO:2);
preferably a nucleic acid molecule, in particular a recombinant DNA molecule or
cloned gene, encoding the 113 kD receptor recognition factor has a nucleotide
sequence or is complementary to a DNA sequence shown in FIGURE 1 (SEQ ID
NO:1). In another embodiment, the receptor recognition factor has a molecular
weight of about 91 kD and the amino acid sequence set forth in FIGURE 2 (SEQ
ID NO:4) or FIGURE 13 (SEQ ID NO:8); preferably a nucleic acid molecule, in
particular a recombinant DNA molecule or cloned gene, encoding the 91 kD
receptor recognition factor has a nucleotide sequence or is complementary to a
DNA sequence shown in FIGURE 2 (SEQ ID NO:3) or FIGURE 13 (SEQ ID
NO:8). In yet a further embodiment, the receptor recognition factor has a
molecular weight of about 84 kD and the amino acid sequence set forth in
FIGURE 3 (SEQ ID NO:6); preferably a nucleic acid molecule, in particular a
recombinant DNA molecule or cloned gene, encoding the 84 kD receptor
recognition factor has a nucleotide sequence or is complementary to a DNA
sequence shown in FIGURE 3 (SEQ ID NO:5). In yet another embodiment, the
receptor recognition factor has an amino acid sequence set forth in FIGURE 14
(SEQ ID NO:10); preferably a nucleic acid molecule, in particular a recombinant
DNA molecule or cloned gene, encoding such receptor recognition factor has a
nucleotide sequence or is complementary to a DNA sequence shown in FIGURE 14
(SEQ ID NO:9). In still another embodiment, the receptor recognition factor has
an amino acid sequence set forth in FIGURE 15 (SEQ ID NO:12); preferably a
nucleic acid molecule, in particular a recombinant DNA molecule or cloned gene,
encoding such receptor recognition factor has a nucleotide sequence or is
complementary to a DNA sequence shown in FIGURE 15 (SEQ ID NO:11).

The possibilities both diagnostic and therapeutic that are raised by the existence of 
the receptor recognition factor or factors, derive from the fact that the factors 
appear to participate in direct and causal protein-protein interaction between the 
receptor that is occupied by its ligand, and those factors that thereafter directly 
interface with the gene and effect transcription and accordingly gene activation. 
As suggested earlier and elaborated further on herein, the present invention 
contemplates pharmaceutical intervention in the cascade of reactions in which the 
receptor recognition factor is implicated, to modulate the activity initiated by the 
stimulus bound to the cellular receptor.

Thus, in instances where it is desired to reduce or inhibit the gene activity
resulting from a particular stimulus or factor, an appropriate inhibitor of the 
receptor recognition factor could be introduced to block the interaction of the 
receptor recognition factor with those factors causally connected with gene 
activation. Correspondingly, instances where insufficient gene activation is taking 
place could be remedied by the introduction of additional quantities of the receptor 
recognition factor or its chemical or pharmaceutical cognates, analogs, fragments 
and the like. 

As discussed earlier, the recognition factors or their binding partners or other
ligands or agents exhibiting either mimicry or antagonism to the recognition
factors or control over their production, may be prepared in pharmaceutical
compositions, with a suitable carrier and at a strength effective for administration
by various means to a patient experiencing an adverse medical condition associated
with specific transcriptional stimulation for the treatment thereof. A variety of
administrative techniques may be utilized, among them parenteral techniques such
as subcutaneous, intravenous and intraperitoneal injections, catheterizations and the
like. Average quantities of the recognition factors or their subunits may vary and
in particular should be based upon the recommendations and prescription of a
qualified physician or veterinarian.

Also, antibodies including both polyclonal and monoclonal antibodies, and drugs
that modulate the production or activity of the recognition factors and/or their
subunits may possess certain diagnostic applications and may, for example, be
utilized for the purpose of detecting and/or measuring conditions such as viral
infection or the like. For example, the recognition factor or its subunits may be
used to produce both polyclonal and monoclonal antibodies to themselves in a
variety of cellular media, by known techniques such as the hybridoma technique
utilizing, for example, fused mouse spleen lymphocytes and myeloma cells.
Likewise, small molecules that mimic or antagonize the activity(ies) of the
receptor recognition factors of the invention may be discovered or synthesized,
and may be used in diagnostic and/or therapeutic protocols.

The general methodology for making monoclonal antibodies by hybridomas is well
known. Immortal, antibody-producing cell lines can also be created by techniques
other than fusion, such as direct transformation of B lymphocytes with oncogenic
DNA, or transfection with Epstein-Barr virus. See, e.g., M. Schreier et al.,
"Hybridoma Techniques" (1980); Hammerling et al., "Monoclonal Antibodies And
T-cell Hybridomas" (1981); Kennett et al., "Monoclonal Antibodies" (1980); see
also U.S. Patent Nos. 4,341,761; 4,399,121; 4,427,783; 4,444,887; 4,451,570;
4,466,917; 4,472,500; 4,491,632; 4,493,890.

Panels of monoclonal antibodies produced against recognition factor peptides can
be screened for various properties; i.e., isotype, epitope, affinity, etc. Of
particular interest are monoclonal antibodies that neutralize the activity of the
recognition factor or its subunits. Such monoclonals can be readily identified in
recognition factor activity assays. High affinity antibodies are also useful when
immunoaffinity purification of native or recombinant recognition factor is possible.

Preferably, the anti-recognition factor antibody used in the diagnostic methods of
this invention is an affinity purified polyclonal antibody. More preferably, the
antibody is a monoclonal antibody (mAb). In addition, it is preferable for the
anti-recognition factor antibody molecules used herein to be in the form of Fab,
Fab', F(ab')2 or F(v) portions of whole antibody molecules.

As suggested earlier, the diagnostic method of the present invention comprises
examining a cellular sample or medium by means of an assay including an
effective amount of an antagonist to a receptor recognition factor/protein, such as
an anti-recognition factor antibody, preferably an affinity-purified polyclonal
antibody, and more preferably a mAb. In addition, it is preferable for the
anti-recognition factor antibody molecules used herein to be in the form of Fab, Fab',
F(ab')2 or F(v) portions or whole antibody molecules. As previously discussed,
patients capable of benefiting from this method include those suffering from
cancer, a pre-cancerous lesion, a viral infection or other like pathological
derangement. Methods for isolating the recognition factor and inducing
anti-recognition factor antibodies and for determining and optimizing the ability of
anti-recognition factor antibodies to assist in the examination of the target cells are all
well-known in the art.

Methods for producing polyclonal anti-polypeptide antibodies are well-known in
the art. See U.S. Patent No. 4,493,795 to Nestor et al. A monoclonal antibody,
typically containing Fab and/or F(ab')2 portions of useful antibody molecules, can
be prepared using the hybridoma technology described in Antibodies - A
Laboratory Manual, Harlow and Lane, eds., Cold Spring Harbor Laboratory, New
York (1988), which is incorporated herein by reference. Briefly, to form the
hybridoma from which the monoclonal antibody composition is produced, a
myeloma or other self-perpetuating cell line is fused with lymphocytes obtained
from the spleen of a mammal hyperimmunized with a recognition factor-binding
portion thereof, or recognition factor, or an origin-specific DNA-binding portion
thereof.

Splenocytes are typically fused with myeloma cells using polyethylene glycol
(PEG) 6000. Fused hybrids are selected by their sensitivity to HAT. Hybridomas
producing a monoclonal antibody useful in practicing this invention are identified
by their ability to immunoreact with the present recognition factor and their ability
to inhibit specified transcriptional activity in target cells.

A monoclonal antibody useful in practicing the present invention can be produced
by initiating a monoclonal hybridoma culture comprising a nutrient medium
containing a hybridoma that secretes antibody molecules of the appropriate antigen
specificity. The culture is maintained under conditions and for a time period
sufficient for the hybridoma to secrete the antibody molecules into the medium.
The antibody-containing medium is then collected. The antibody molecules can
then be further isolated by well-known techniques.

Media useful for the preparation of these compositions are both well-known in the
art and commercially available and include synthetic culture media, inbred mice
and the like. An exemplary synthetic medium is Dulbecco's minimal essential
medium (DMEM; Dulbecco et al., Virol. 8:396 (1959)) supplemented with 4.5
gm/l glucose, 20 mM glutamine, and 20% fetal calf serum. An exemplary inbred
mouse strain is the Balb/c.

Methods for producing monoclonal anti-recognition factor antibodies are also
well-known in the art. See Niman et al., Proc. Natl. Acad. Sci. USA, 80:4949-4953
(1983). Typically, the present recognition factor or a peptide analog is used either
alone or conjugated to an immunogenic carrier, as the immunogen in the before
described procedure for producing anti-recognition factor monoclonal antibodies.
The hybridomas are screened for the ability to produce an antibody that
immunoreacts with the recognition factor peptide analog and the present
recognition factor.

The present invention further contemplates therapeutic compositions useful in
practicing the therapeutic methods of this invention. A subject therapeutic
composition includes, in admixture, a pharmaceutically acceptable excipient
(carrier) and one or more of a receptor recognition factor, polypeptide analog
thereof or fragment thereof, as described herein as an active ingredient. In a
preferred embodiment, the composition comprises an antigen capable of
modulating the specific binding of the present recognition factor within a target
cell.

The preparation of therapeutic compositions which contain polypeptides, analogs
or active fragments as active ingredients is well understood in the art. Typically,
such compositions are prepared as injectables, either as liquid solutions or
suspensions; however, solid forms suitable for solution in, or suspension in, liquid
prior to injection can also be prepared. The preparation can also be emulsified.
The active therapeutic ingredient is often mixed with excipients which are
pharmaceutically acceptable and compatible with the active ingredient. Suitable
excipients are, for example, water, saline, dextrose, glycerol, ethanol, or the like
and combinations thereof. In addition, if desired, the composition can contain
minor amounts of auxiliary substances such as wetting or emulsifying agents, or pH
buffering agents, which enhance the effectiveness of the active ingredient.

A polypeptide, analog or active fragment can be formulated into the therapeutic
composition as neutralized pharmaceutically acceptable salt forms.
Pharmaceutically acceptable salts include the acid addition salts (formed with the
free amino groups of the polypeptide or antibody molecule) and which are formed
with inorganic acids such as, for example, hydrochloric or phosphoric acids, or
such organic acids as acetic, oxalic, tartaric, mandelic, and the like. Salts formed
from the free carboxyl groups can also be derived from inorganic bases such as,
for example, sodium, potassium, ammonium, calcium, or ferric hydroxides, and
such organic bases as isopropylamine, trimethylamine, 2-ethylamino ethanol,
histidine, procaine, and the like.

The therapeutic polypeptide-, analog- or active fragment-containing compositions
are conventionally administered intravenously, as by injection of a unit dose, for
example. The term "unit dose" when used in reference to a therapeutic
composition of the present invention refers to physically discrete units suitable as
unitary dosage for humans, each unit containing a predetermined quantity of active
material calculated to produce the desired therapeutic effect in association with the
required diluent; i.e., carrier, or vehicle.

The compositions are administered in a manner compatible with the dosage
formulation, and in a therapeutically effective amount. The quantity to be
administered depends on the subject to be treated, capacity of the subject's
immune system to utilize the active ingredient, and degree of inhibition or
neutralization of recognition factor binding capacity desired. Precise amounts of
active ingredient required to be administered depend on the judgment of the
practitioner and are peculiar to each individual. However, suitable dosages may
range from about 0.1 to 20, preferably about 0.5 to about 10, and more preferably
one to several, milligrams of active ingredient per kilogram body weight of
individual per day and depend on the route of administration. Suitable regimes for
initial administration and booster shots are also variable, but are typified by an
initial administration followed by repeated doses at one or more hour intervals by
a subsequent injection or other administration. Alternatively, continuous
intravenous infusion sufficient to maintain concentrations of ten nanomolar to ten
micromolar in the blood is contemplated.
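
The dosage figures above reduce to simple arithmetic. The following sketch is offered only as an illustration of that arithmetic: it converts the stated per-kilogram range into a daily amount for a hypothetical 70 kg individual, and converts the stated infusion concentrations into mass per liter on the assumption that the active ingredient is the 91 kD factor, with its molar mass taken as roughly 91,000 g/mol. The body weight, the function names, and the choice of the 91 kD factor are assumptions made for the example and are not part of the disclosure.

```python
# Illustrative arithmetic only; the constants below are assumptions.

def daily_dose_mg(dose_mg_per_kg: float, body_weight_kg: float) -> float:
    """Total daily amount (mg) for a dose expressed per kilogram body weight."""
    return dose_mg_per_kg * body_weight_kg


def molar_to_mg_per_liter(conc_molar: float, molar_mass_g_per_mol: float) -> float:
    """Convert a molar concentration (mol/L) into milligrams per liter."""
    return conc_molar * molar_mass_g_per_mol * 1000.0


BODY_WEIGHT_KG = 70.0        # hypothetical individual
MOLAR_MASS_91KD = 91_000.0   # assumption: 91 kD taken as 91,000 g/mol

if __name__ == "__main__":
    # Endpoints of the broad and the preferred ranges stated in the text.
    for dose in (0.1, 0.5, 10.0, 20.0):
        total = daily_dose_mg(dose, BODY_WEIGHT_KG)
        print(f"{dose} mg/kg/day -> {total:.1f} mg/day")
    # Ten nanomolar and ten micromolar, the stated infusion targets.
    for conc in (10e-9, 10e-6):
        mass = molar_to_mg_per_liter(conc, MOLAR_MASS_91KD)
        print(f"{conc:.0e} mol/L -> {mass:.2f} mg/L")
```

Under these assumptions the broad range of 0.1 to 20 mg/kg/day corresponds to 7 to 1400 mg per day, and ten nanomolar to ten micromolar corresponds to about 0.91 to 910 mg per liter.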



The therapeutic compositions may further include an effective amount of the
factor/factor synthesis promoter antagonist or analog thereof, and one or more of
the following active ingredients: an antibiotic, a steroid. Exemplary formulations
are given below:

Formulations

Intravenous Formulation I

Ingredient                          mg/ml
cefotaxime                          250.0
receptor recognition factor         10.0
dextrose USP                        45.0
sodium bisulfite USP                3.2
edetate disodium USP                0.1
water for injection q.s.a.d.        1.0 ml

Intravenous Formulation II

Ingredient                          mg/ml
ampicillin                          250.0
receptor recognition factor         10.0
sodium bisulfite USP                3.2
disodium edetate USP                0.1
water for injection q.s.a.d.        1.0 ml

Intravenous Formulation III

Ingredient                          mg/ml
gentamicin (charged as sulfate)     40.0
receptor recognition factor         10.0
sodium bisulfite USP                3.2
disodium edetate USP                0.1
water for injection q.s.a.d.        1.0 ml

Intravenous Formulation IV

Ingredient                          mg/ml
recognition factor                  10.0
dextrose USP                        45.0
sodium bisulfite USP                3.2
edetate disodium USP                0.1
water for injection q.s.a.d.        1.0 ml

Intravenous Formulation V

Ingredient                          mg/ml
recognition factor antagonist       5.0
sodium bisulfite USP                3.2
disodium edetate USP                0.1
water for injection q.s.a.d.        1.0 ml

As used herein, "pg" means picogram, "ng" means nanogram, "ug" or "μg" mean
microgram, "mg" means milligram, "ul" or "μl" mean microliter, "ml" means
milliliter, "l" means liter.

Another feature of this invention is the expression of the DNA sequences disclosed
herein. As is well known in the art, DNA sequences may be expressed by
operatively linking them to an expression control sequence in an appropriate
expression vector and employing that expression vector to transform an
appropriate unicellular host.

Such operative linking of a DNA sequence of this invention to an expression
control sequence, of course, includes, if not already part of the DNA sequence,
the provision of an initiation codon, ATG, in the correct reading frame upstream
of the DNA sequence.
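
The reading-frame condition can be checked mechanically. The following is a minimal sketch, in Python, that tests whether an ATG located at or upstream of a coding sequence lies in the same reading frame and is not separated from it by a stop codon. The function name, the example constructs, and the 0-based index convention are hypothetical and serve only to illustrate the condition stated above.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def atg_in_frame(construct: str, coding_start: int) -> bool:
    """Return True if an ATG at or upstream of coding_start is in frame with
    the coding sequence and no stop codon lies between the two.

    construct    -- DNA sequence written 5' to 3'
    coding_start -- 0-based index of the first base of the coding sequence
    """
    seq = construct.upper()
    pos = coding_start
    # Step upstream one codon at a time so that every codon examined
    # is in the same reading frame as the coding sequence.
    while pos >= 0:
        codon = seq[pos:pos + 3]
        if codon == "ATG":
            return True
        if codon in STOP_CODONS:
            return False  # translation started further upstream would stop here
        pos -= 3
    return False


if __name__ == "__main__":
    # Hypothetical construct: ATG at index 2, coding sequence begins at index 8.
    print(atg_in_frame("CCATGGCTAAAGGT", 8))   # ATG is six bases upstream: in frame
    print(atg_in_frame("CCATGGCTAAAGGT", 9))   # shifted by one base: out of frame
```

A shift of one or two bases between the ATG and the coding sequence changes every downstream codon, which is why the check steps in multiples of three.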

A wide variety of host/expression vector combinations may be employed in
expressing the DNA sequences of this invention. Useful expression vectors, for
example, may consist of segments of chromosomal, non-chromosomal and
synthetic DNA sequences. Suitable vectors include derivatives of SV40 and
known bacterial plasmids, e.g., E. coli plasmids col E1, pCR1, pBR322, pMB9
and their derivatives, plasmids such as RP4; phage DNAs, e.g., the numerous
derivatives of phage λ, e.g., NM989, and other phage DNA, e.g., M13 and
filamentous single stranded phage DNA; yeast plasmids such as the 2μ plasmid or
derivatives thereof; vectors useful in eukaryotic cells, such as vectors useful in
insect or mammalian cells; vectors derived from combinations of plasmids and
phage DNAs, such as plasmids that have been modified to employ phage DNA or
other expression control sequences; and the like.

Any of a wide variety of expression control sequences (sequences that control the
expression of a DNA sequence operatively linked to them) may be used in these
vectors to express the DNA sequences of this invention. Such useful expression
control sequences include, for example, the early or late promoters of SV40,
CMV, vaccinia, polyoma or adenovirus, the lac system, the trp system, the TAC
system, the TRC system, the LTR system, the major operator and promoter regions
of phage λ, the control regions of fd coat protein, the promoter for
3-phosphoglycerate kinase or other glycolytic enzymes, the promoters of acid
phosphatase (e.g., Pho5), the promoters of the yeast α-mating factors, and other
sequences known to control the expression of genes of prokaryotic or eukaryotic
cells or their viruses, and various combinations thereof.

A wide variety of unicellular host cells are also useful in expressing the DNA
sequences of this invention. These hosts may include well known eukaryotic and
prokaryotic hosts, such as strains of E. coli, Pseudomonas, Bacillus,
Streptomyces, fungi such as yeasts, and animal cells, such as CHO, R1.1, B-W and
L-M cells, African Green Monkey kidney cells (e.g., COS 1, COS 7, BSC1,
BSC40, and BMT10), insect cells (e.g., Sf9), and human cells and plant cells in
tissue culture.

It will be understood that not all vectors, expression control sequences and hosts
will function equally well to express the DNA sequences of this invention.
Neither will all hosts function equally well with the same expression system.
However, one skilled in the art will be able to select the proper vectors,
expression control sequences, and hosts without undue experimentation to
accomplish the desired expression without departing from the scope of this
invention. For example, in selecting a vector, the host must be considered
because the vector must function in it. The vector's copy number, the ability to
control that copy number, and the expression of any other proteins encoded by the
vector, such as antibiotic markers, will also be considered.



In selecting an expression control sequence, a variety of factors will normally be
considered. These include, for example, the relative strength of the system, its
controllability, and its compatibility with the particular DNA sequence or gene to
be expressed, particularly as regards potential secondary structures. Suitable
unicellular hosts will be selected by consideration of, e.g., their compatibility with
the chosen vector, their secretion characteristics, their ability to fold proteins
correctly, and their fermentation requirements, as well as the toxicity to the host
of the product encoded by the DNA sequences to be expressed, and the ease of
purification of the expression products.

Considering these and other factors, a person skilled in the art will be able to
construct a variety of vector/expression control sequence/host combinations that
will express the DNA sequences of this invention on fermentation or in large scale
animal culture.



It is further intended that receptor recognition factor analogs may be prepared
from nucleotide sequences of the protein complex/subunit derived within the scope
of the present invention. Analogs, such as fragments, may be produced, for
example, by pepsin digestion of receptor recognition factor material. Other
analogs, such as muteins, can be produced by standard site-directed mutagenesis of
receptor recognition factor coding sequences. Analogs exhibiting "receptor
recognition factor activity" such as small molecules, whether functioning as
promoters or inhibitors, may be identified by known in vivo and/or in vitro assays.

As mentioned above, a DNA sequence encoding receptor recognition factor can be
prepared synthetically rather than cloned. The DNA sequence can be designed
with the appropriate codons for the receptor recognition factor amino acid
sequence. In general, one will select preferred codons for the intended host if the
sequence will be used for expression. The complete sequence is assembled from
overlapping oligonucleotides prepared by standard methods and assembled into a
complete coding sequence. See, e.g., Edge, Nature, 292:756 (1981); Nambair et
al., Science, 223:1299 (1984); Jay et al., J. Biol. Chem., 259:6311 (1984).
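
The selection of preferred codons can be illustrated with a short sketch. The Python below designs a coding sequence from an amino acid sequence by choosing one codon per residue from a preference table. The table shown is a hypothetical single-codon-per-residue choice loosely modeled on codons frequently used in E. coli; it is an assumption for the example, and an actual design would consult a codon usage table for the intended host. The function name and the example peptide are likewise illustrative.

```python
# Hypothetical preference table: one codon per amino acid (one-letter code).
# Each codon encodes the listed residue in the standard genetic code; which
# synonymous codon a given host prefers is an assumption here.
PREFERRED_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}


def back_translate(peptide: str, stop: str = "TAA") -> str:
    """Design a coding sequence using one preferred codon per residue,
    followed by a stop codon."""
    try:
        codons = [PREFERRED_CODON[residue] for residue in peptide.upper()]
    except KeyError as err:
        raise ValueError(f"unknown amino acid: {err.args[0]}") from None
    return "".join(codons) + stop


if __name__ == "__main__":
    # Hypothetical three-residue peptide Met-Lys-Thr.
    print(back_translate("MKT"))
```

A sequence designed this way would still need to be divided into overlapping oligonucleotides for assembly, a step this sketch does not attempt.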

Synthetic DNA sequences allow convenient construction of genes which will
express receptor recognition factor analogs or "muteins". Alternatively, DNA
encoding muteins can be made by site-directed mutagenesis of native receptor
recognition factor genes or cDNAs, and muteins can be made directly using
conventional polypeptide synthesis.

A general method for site-specific incorporation of unnatural amino acids into 
proteins is described in Christopher J. Noren, Spencer J. Anthony-Cahill, Michael 
C. Griffith, Peter G. Schultz, Science, 244:182-188 (April 1989). This method 
may be used to create analogs with unnatural amino acids. 


The present invention extends to the preparation of antisense nucleotides and
ribozymes that may be used to interfere with the expression of the receptor
recognition proteins at the translational level. This approach utilizes antisense
nucleic acid and ribozymes to block translation of a specific mRNA, either by
masking that mRNA with an antisense nucleic acid or cleaving it with a ribozyme.

Antisense nucleic acids are DNA or RNA molecules that are complementary to at
least a portion of a specific mRNA molecule. (See Weintraub, 1990;
Marcus-Sekura, 1988.) In the cell, they hybridize to that mRNA, forming a double
stranded molecule. The cell does not translate an mRNA in this double-stranded
form. Therefore, antisense nucleic acids interfere with the expression of mRNA
into protein. Oligomers of about fifteen nucleotides and molecules that hybridize
to the AUG initiation codon will be particularly efficient, since they are easy to
synthesize and are likely to pose fewer problems than larger molecules when
introducing them into receptor recognition factor-producing cells. Antisense
methods have been used to inhibit the expression of many genes in vitro
(Marcus-Sekura, 1988; Hambor et al., 1988).
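
The complementarity described above can be made concrete with a short sketch. The Python below derives a fifteen-nucleotide antisense DNA oligomer that is complementary to a window of an mRNA spanning an AUG codon. It assumes, for simplicity, that the first AUG in the supplied sequence is the initiation codon; the mRNA sequence, the function name, and the way the window is centered are hypothetical choices made for illustration.

```python
# Base pairing of a DNA oligomer against an RNA target.
DNA_COMPLEMENT_OF_RNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def antisense_oligo(mrna: str, length: int = 15) -> str:
    """Return a DNA oligomer, written 5' to 3', complementary to a window
    of the mRNA that contains the first AUG codon.

    The input is expected to use the RNA alphabet (A, C, G, U).
    Assumption: the first AUG is treated as the initiation codon.
    """
    rna = mrna.upper()
    aug = rna.find("AUG")
    if aug < 0:
        raise ValueError("no AUG codon found")
    # Center the window on the AUG where possible, clamped to the ends.
    start = max(0, min(aug - (length - 3) // 2, len(rna) - length))
    window = rna[start:start + length]
    # Reverse the window and complement each base so the oligomer reads 5' to 3'.
    return "".join(DNA_COMPLEMENT_OF_RNA[base] for base in reversed(window))


if __name__ == "__main__":
    example_mrna = "GGCACCAUGGCUCAGUGGAAC"  # hypothetical sequence
    print(antisense_oligo(example_mrna))
```

The resulting oligomer contains CAT, the DNA complement of AUG, which is the portion that would mask the initiation codon on hybridization.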

Ribozymes are RNA molecules possessing the ability to specifically cleave other
single stranded RNA molecules in a manner somewhat analogous to DNA
restriction endonucleases. Ribozymes were discovered from the observation that
certain mRNAs have the ability to excise their own introns. By modifying the
nucleotide sequence of these RNAs, researchers have been able to engineer
molecules that recognize specific nucleotide sequences in an RNA molecule and
cleave it (Cech, 1988). Because they are sequence-specific, only mRNAs with
particular sequences are inactivated.

Investigators have identified two types of ribozymes, Tetrahymena-type and
"hammerhead"-type (Hasselhoff and Gerlach, 1988). Tetrahymena-type ribozymes
recognize four-base sequences, while "hammerhead"-type recognize eleven- to
eighteen-base sequences. The longer the recognition sequence, the more likely it
is to occur exclusively in the target mRNA species. Therefore, hammerhead-type
ribozymes are preferable to Tetrahymena-type ribozymes for inactivating a specific
mRNA species, and eighteen base recognition sequences are preferable to shorter
recognition sequences.
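
The relationship between recognition length and specificity can be estimated with a rough calculation. The sketch below assumes a random sequence with equal frequencies of the four bases, which is a simplification, and computes the expected number of chance occurrences of one specific recognition site in a given amount of RNA sequence. The total sequence length used in the example is a hypothetical figure, and the function name is illustrative.

```python
def expected_random_matches(site_length: int, total_nucleotides: int) -> float:
    """Expected chance occurrences of one specific site of the given length
    in a random sequence with equal base frequencies.

    Each of the (total - site_length + 1) start positions matches with
    probability 1 / 4**site_length.
    """
    positions = max(total_nucleotides - site_length + 1, 0)
    return positions / 4 ** site_length


if __name__ == "__main__":
    TOTAL_NT = 10_000_000  # hypothetical amount of distinct mRNA sequence
    for n in (4, 11, 18):
        print(n, expected_random_matches(n, TOTAL_NT))
```

Under this simplification a four-base site is expected about once in every 256 positions, whereas an eighteen-base site is expected about once in every 4^18 (roughly 6.9 x 10^10) positions, which is consistent with the preference stated above for longer recognition sequences.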


The DNA sequences described herein may thus be used to prepare antisense 
molecules against, and ribozymes that cleave mRNAs for receptor recognition 
factor proteins and their ligands. 

The present invention also relates to a variety of diagnostic applications, including
methods for detecting the presence of stimuli such as the earlier referenced
polypeptide ligands, by reference to their ability to elicit the activities which are
mediated by the present receptor recognition factor. As mentioned earlier, the
receptor recognition factor can be used to produce antibodies to itself by a variety
of known techniques, and such antibodies could then be isolated and utilized in
tests for the presence of particular transcriptional activity in suspect target cells.

As described in detail above, antibody(ies) to the receptor recognition factor can
be produced and isolated by standard methods including the well known hybridoma
techniques. For convenience, the antibody(ies) to the receptor recognition factor
will be referred to herein as Ab₁ and antibody(ies) raised in another species as
Ab₂.

The presence of receptor recognition factor in cells can be ascertained by the usual
immunological procedures applicable to such determinations. A number of useful
procedures are known. Three such procedures which are especially useful utilize
either the receptor recognition factor labeled with a detectable label, antibody Ab₁
labeled with a detectable label, or antibody Ab₂ labeled with a detectable label.
The procedures may be summarized by the following equations wherein the
asterisk indicates that the particle is labeled, and "RRF" stands for the receptor
recognition factor:

A. RRF* + Ab₁ = RRF*Ab₁

B. RRF + Ab₁* = RRFAb₁*

C. RRF + Ab₁ + Ab₂* = RRFAb₁Ab₂*

The procedures and their application are all familiar to those skilled in the art and
accordingly may be utilized within the scope of the present invention. The
"competitive" procedure, Procedure A, is described in U.S. Patent Nos. 3,654,090
and 3,850,752. Procedure C, the "sandwich" procedure, is described in U.S.
Patent Nos. RE 31,006 and 4,016,043. Still other procedures are known such as
the "double antibody", or "DASP" procedure.

In each instance, the receptor recognition factor forms complexes with one or
more antibody(ies) or binding partners and one member of the complex is labeled
with a detectable label. The fact that a complex has formed and, if desired, the
amount thereof, can be determined by known methods applicable to the detection
of labels.

It will be seen from the above, that a characteristic property of Ab₂ is that it will
react with Ab₁. This is because Ab₁ raised in one mammalian species has been
used in another species as an antigen to raise the antibody Ab₂. For example, Ab₂
may be raised in goats using rabbit antibodies as antigens. Ab₂ therefore would be
anti-rabbit antibody raised in goats. For purposes of this description and claims,
Ab₁ will be referred to as a primary or anti-receptor recognition factor antibody,
and Ab₂ will be referred to as a secondary or anti-Ab₁ antibody.

The labels most commonly employed for these studies are radioactive elements,
enzymes, chemicals which fluoresce when exposed to ultraviolet light, and others.

A number of fluorescent materials are known and can be utilized as labels. These
include, for example, fluorescein, rhodamine and auramine. A particular detecting
material is anti-rabbit antibody prepared in goats and conjugated with fluorescein
through an isothiocyanate.

The receptor recognition factor or its binding partner(s) can also be labeled with a
radioactive element or with an enzyme. The radioactive label can be detected by
any of the currently available counting procedures. The preferred isotope may be
selected from ³H, ¹⁴C, ³²P, ³⁵S, ³⁶Cl, ⁵¹Cr, ⁵⁷Co, ⁵⁸Co, ⁵⁹Fe, ⁹⁰Y, ¹²⁵I, ¹³¹I, and
¹⁸⁶Re.

Enzyme labels are likewise useful, and can be detected by any of the presently
utilized colorimetric, spectrophotometric, fluorospectrophotometric, amperometric
or gasometric techniques. The enzyme is conjugated to the selected particle by
reaction with bridging molecules such as carbodiimides, diisocyanates,
glutaraldehyde and the like. Many enzymes which can be used in these
procedures are known and can be utilized. The preferred are peroxidase,
β-glucuronidase, β-D-glucosidase, β-D-galactosidase, urease, glucose oxidase plus
peroxidase and alkaline phosphatase. U.S. Patent Nos. 3,654,090; 3,850,752; and
4,016,043 are referred to by way of example for their disclosure of alternate
labeling material and methods.

A particular assay system developed and utilized in accordance with the present
invention is known as a receptor assay. In a receptor assay, the material to be
assayed is appropriately labeled and then certain cellular test colonies are
inoculated with a quantity of both the labeled and unlabeled material after which
binding studies are conducted to determine the extent to which the labeled material
binds to the cell receptors. In this way, differences in affinity between materials
can be ascertained.

Accordingly, a purified quantity of the receptor recognition factor may be
radiolabeled and combined, for example, with antibodies or other inhibitors
thereto, after which binding studies would be carried out. Solutions would then be
prepared that contain various quantities of labeled and unlabeled uncombined
receptor recognition factor, and cell samples would then be inoculated and
thereafter incubated. The resulting cell monolayers are then washed, solubilized
and then counted in a gamma counter for a length of time sufficient to yield a
standard error of <5%. These data are then subjected to Scatchard analysis after
which observations and conclusions regarding material activity can be drawn.
While the foregoing is exemplary, it illustrates the manner in which a receptor
assay may be performed and utilized, in the instance where the cellular binding
ability of the assayed material may serve as a distinguishing characteristic.
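
By way of illustration only, the Scatchard analysis mentioned above can be sketched as a straight-line fit of bound/free against bound, where the slope is -1/Kd and the intercept on the bound axis is Bmax. The function name, the single-site binding model and the numerical values below are assumptions chosen for the illustration and are not data from the present disclosure.

```python
def scatchard_fit(bound, free):
    """Least-squares line through the points (B, B/F).

    Assumes a single class of non-interacting binding sites, for which
    B/F = Bmax/Kd - B/Kd. Returns (Kd, Bmax) in the units of the inputs.
    """
    xs = list(bound)
    ys = [b / f for b, f in zip(bound, free)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                    # equals -1/Kd
    intercept = mean_y - slope * mean_x  # equals Bmax/Kd
    kd = -1.0 / slope
    bmax = -intercept / slope
    return kd, bmax


# Hypothetical, noise-free data generated from Kd = 2.0 and Bmax = 100.0
# (arbitrary concentration units), used only to exercise the fit.
free_conc = [0.5, 1.0, 2.0, 4.0, 8.0]
bound_conc = [100.0 * f / (2.0 + f) for f in free_conc]
kd_est, bmax_est = scatchard_fit(bound_conc, free_conc)
```

With real counting data the points scatter about the line, and curvature in the plot would suggest that the single-site assumption does not hold.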

An assay useful and contemplated in accordance with the present invention is
known as a "cis/trans" assay. Briefly, this assay employs two genetic constructs,
one of which is typically a plasmid that continually expresses a particular receptor
of interest when transfected into an appropriate cell line, and the second of which
is a plasmid that expresses a reporter such as luciferase, under the control of a
receptor/ligand complex. Thus, for example, if it is desired to evaluate a
compound as a ligand for a particular receptor, one of the plasmids would be a
construct that results in expression of the receptor in the chosen cell line, while the
second plasmid would possess a promoter linked to the luciferase gene in which
the response element to the particular receptor is inserted. If the compound under
test is an agonist for the receptor, the ligand will complex with the receptor, and
the resulting complex will bind the response element and initiate transcription of
the luciferase gene. The resulting chemiluminescence is then measured
photometrically, and dose response curves are obtained and compared to those of
known ligands. The foregoing protocol is described in detail in U.S. Patent No.
4,981,784 and PCT International Publication No. WO 88/03168, to which the
artisan is referred.
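
As a hedged illustration of how dose response curves from such a reporter assay might be compared, the sketch below expresses luciferase readings as fold induction over a vehicle control and estimates the half-maximal dose (EC50) by interpolation on a logarithmic dose axis. The function names and the readings are hypothetical; the sketch assumes doses in ascending order and a response that rises with dose, and it is not the protocol of the cited references.

```python
import math


def fold_induction(signals, vehicle_signal):
    """Express each reporter reading relative to the vehicle-only control."""
    return [s / vehicle_signal for s in signals]


def ec50_by_interpolation(doses, responses):
    """Estimate the dose giving the half-maximal response.

    Interpolates linearly in log10(dose) between the two measured points
    that bracket the midpoint between the lowest and highest response.
    """
    low, high = min(responses), max(responses)
    half = low + (high - low) / 2.0
    points = list(zip(doses, responses))
    for (d0, r0), (d1, r1) in zip(points, points[1:]):
        if r0 <= half <= r1 and r1 != r0:
            frac = (half - r0) / (r1 - r0)
            log_ec50 = math.log10(d0) + frac * (math.log10(d1) - math.log10(d0))
            return 10 ** log_ec50
    raise ValueError("half-maximal response is not bracketed by the data")


# Hypothetical readings: vehicle control of 1000 light units, and a test
# compound whose response follows a simple saturation curve with EC50 = 10.
doses = [0.1, 1.0, 10.0, 100.0, 1000.0]
raw = [1000.0 * (1.0 + 9.0 * d / (d + 10.0)) for d in doses]
induction = fold_induction(raw, 1000.0)
ec50 = ec50_by_interpolation(doses, induction)
```

A compound with a lower estimated EC50 than a known ligand would, under these assumptions, be the more potent agonist in the assay.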

In a further embodiment of this invention, commercial test kits suitable for use by
a medical specialist may be prepared to determine the presence or absence of
predetermined transcriptional activity or predetermined transcriptional activity
capability in suspected target cells. In accordance with the testing techniques
discussed above, one class of such kits will contain at least the labeled receptor
recognition factor or its binding partner, for instance an antibody specific thereto,
and directions, of course, depending upon the method selected, e.g.,
"competitive", "sandwich", "DASP" and the like. The kits may also contain
peripheral reagents such as buffers, stabilizers, etc.

Accordingly, a test kit may be prepared for the demonstration of the presence or
capability of cells for predetermined transcriptional activity, comprising:

(a) a predetermined amount of at least one labeled immunochemically reactive
component obtained by the direct or indirect attachment of the present receptor
recognition factor or a specific binding partner thereto, to a detectable label;

(b) other reagents; and

(c) directions for use of said kit.

More specifically, the diagnostic test kit may comprise:

(a) a known amount of the receptor recognition factor as described above (or
a binding partner) generally bound to a solid phase to form an immunosorbent, or
in the alternative, bound to a suitable tag, or plural such end products, etc. (or
their binding partners) one of each;

(b) if necessary, other reagents; and

(c) directions for use of said test kit.

In a further variation, the test kit may be prepared and used for the purposes stated
above, which operates according to a predetermined protocol (e.g. "competitive",
"sandwich", "double antibody", etc.), and comprises:

(a) a labeled component which has been obtained by coupling the receptor
recognition factor to a detectable label;

(b) one or more additional immunochemical reagents of which at least one
reagent is a ligand or an immobilized ligand, which ligand is selected from the
group consisting of:

(i) a ligand capable of binding with the labeled component (a);

(ii) a ligand capable of binding with a binding partner of the labeled
component (a);

(iii) a ligand capable of binding with at least one of the component(s) to
be determined; and

(iv) a ligand capable of binding with at least one of the binding partners
of at least one of the component(s) to be determined; and

(c) directions for the performance of a protocol for the detection and/or
determination of one or more components of an immunochemical reaction between
the receptor recognition factor and a specific binding partner thereto.

In accordance with the above, an assay system for screening potential drugs
effective to modulate the activity of the receptor recognition factor may be
prepared. The receptor recognition factor may be introduced into a test system,
and the prospective drug may also be introduced into the resulting cell culture, and
the culture thereafter examined to observe any changes in the transcriptional
activity of the cells, due either to the addition of the prospective drug alone, or
due to the effect of added quantities of the known receptor recognition factor.

PRELIMINARY CONSIDERATIONS

As mentioned earlier, the observation and conclusion underlying the present
invention were crystallized from a consideration of the results of certain
investigations with particular stimuli. Particularly, the present disclosure is
illustrated by the results of work on protein factors that govern transcriptional
control of IFNα-stimulated genes, as well as more recent data on the regulation of
transcription of genes stimulated by IFNγ. The following is a brief discussion of
the role that IFN is believed to play in the stimulation of transcription taken from
Darnell et al. THE NEW BIOLOGIST, 2(10), (1990).

Activation of genes by IFNα occurs within minutes of exposure of cells to this
factor (Larner et al., 1984, 1986) and is strictly dependent on the IFNα binding to
its receptor, a 49-kD plasma membrane polypeptide (Uze et al., 1990). However,
changes in intracellular second messenger concentrations secondary to the use of
phorbol esters, calcium ionophores, or cyclic nucleotide analogs neither trigger
nor block IFNα-dependent gene activation (Larner et al., 1984; Lew et al.,
1989). No other polypeptide, even IFNγ, induces the set of interferon-stimulated
genes (ISGs) specifically induced by IFNα. In addition, it has been found that
IFNγ-dependent transcriptional stimulation of at least one gene in HeLa cells and
in fibroblasts is also strictly dependent on receptor-ligand interaction and is not
activated by induced changes in second messengers (Decker et al., 1989; Lew et
al., 1989). These highly specific receptor-ligand interactions, as well as the
precise transcriptional response, require the intracellular recognition of receptor
occupation and the communication to the nucleus to be equally specific.

The activation of ISGs by IFNα is carried out by transcriptional factor ISGF-3, or
interferon stimulated gene factor 3. This factor is activated promptly after IFNα
treatment without protein synthesis, as is transcription itself (Larner et al., 1986;
Levy et al., 1988; Levy et al., 1989). ISGF-3 binds to the ISRE, the interferon-
stimulated response element, in DNA of the response genes (Reich et al., 1987;
Levy et al., 1988), and this binding is affected by all of an extensive set of
mutations that also affects the transcriptional function of the ISRE (Kessler et al.,
1988a). Partially purified ISGF-3 containing no other DNA-binding components
can stimulate ISRE-dependent in vitro transcription (Fu et al., 1990). IFN-
dependent stimulation of ISGs occurs in a cycle, reaching a peak at 2 hours and
declining promptly thereafter (Larner et al., 1986). ISGF-3 follows the same
cycle (Levy et al., 1988, 1989). Finally, the presence or absence of ISGF-3 in a
variety of IFN-sensitive and IFN-resistant cells correlates with the transcription of
ISGs in these cells (Kessler et al., 1988b).

ISGF-3 is composed of two subfractions, ISGF-3α and ISGF-3γ, that are found in
the cytoplasm before IFN binds to its receptor (Levy et al., 1989). When cells are
treated with IFNα, ISGF-3 can be detected in the cytoplasm within a minute, that
is, some 3 to 4 minutes before any ISGF-3 is found in the nucleus (Levy et al.,
1989). The cytoplasmic component ISGF-3γ can be increased in HeLa cells by
pretreatment with IFNγ, but IFNγ does not by itself activate transcription of ISGs
nor raise the concentration of the complete factor, ISGF-3 (Levy et al., 1990).
The cytoplasmic localization of the proteins that interact to constitute ISGF-3 was
proved by two kinds of experiments. When cytoplasm of IFNγ-treated cells that
lack ISGF-3 was mixed with cytoplasm of IFNα-treated cells, large amounts of
ISGF-3 were formed (Levy et al., 1989). (It was this experiment that indicated
the existence of an ISGF-3γ component and an ISGF-3α component of ISGF-3.)
In addition, Dale et al. (1989) showed that enucleated cells could respond to IFNα
by forming a DNA-binding protein that is probably the same as ISGF-3.

The ISGF-3γ component is a 48-kD protein that specifically recognizes the ISRE
(Kessler et al., 1990; Fu et al., 1990). Three other proteins, presumably
constituting the ISGF-3α component, were found in an ISGF-3 DNA complex (Fu
et al., 1990). The full roles of these three proteins, and the relationships among
them, are not yet known, but it is clear that ISGF-3 is a multimeric protein
complex. Since the binding of IFNα to the cell surface converts ISGF-3α from an
inactive to an active status within a minute, at least one of the proteins constituting
ISGF-3α must be affected promptly, perhaps by a direct interaction with the IFNα
receptor.

The details of how the ISGF-3γ component and the three other proteins are
activated by cytoplasmic events and then enter the nucleus to bind the ISRE and
increase transcription are not entirely known. Further studies of the individual
proteins, for example, with antibodies, are presented herein. For example, it is
clear that, within 10 minutes of IFNα treatment, there is more ISGF-3 in the
nucleus than in the cytoplasm and that the complete factor has a much higher
affinity for the ISRE than the 48-kD ISGF-3γ component by itself (Kessler et al.,
1990).



In summary, the attachment of interferon-α (IFN-α) to its specific cell surface
receptor activates the transcription of a limited set of genes, termed ISGs for
"interferon stimulated genes" [Larner et al., PROC. NATL. ACAD. SCI. USA, 81
(1984); Larner et al., J. BIOL. CHEM., 261 (1986); Friedman et al., CELL, 38
(1984)]. The observation that agents that affect second messenger levels do not
activate transcription of these genes led to the proposal that protein:protein
interactions in the cytoplasm beginning at the IFN receptor might act directly in
transmitting to the nucleus the signal generated by receptor occupation [Levy et
al., NEW BIOLOGIST, 2 (1991)].

To test this hypothesis, the present applicants began experiments in the nucleus at
the activated genes. Initially, the ISRE and ISGF-3 were discovered [Levy et al.,
GENES & DEV., 2 (1988)].

Partial purification of ISGF-3 followed by recovery of the purified proteins from a
specific DNA-protein complex revealed that the complete complex was made up of
four proteins [Fu et al., PROC. NATL. ACAD. SCI. USA, 87 (1990); Kessler et
al., GENES & DEV., 4 (1990)]. A 48 kD protein termed ISGF-3γ, because
pre-treatment of HeLa cells with IFN-γ increased its presence, binds DNA weakly
on its own [Levy et al., THE EMBO J., 9 (1990)]. In combination
with the IFN-α activated proteins, termed collectively the ISGF-3α proteins, the
ISGF-3γ forms a complex that binds the ISRE with a 50-fold higher affinity
[Kessler et al., GENES & DEV., 4 (1990)]. The ISGF-3α proteins comprise a set
of polypeptides of 113, 91 and 84 kD. All of the ISGF-3 components initially
reside in the cell cytoplasm [Levy et al., GENES & DEV., 3 (1989); Dale et al.,
PROC. NATL. ACAD. SCI. USA, 86 (1989)]. However, after only about five
minutes of IFN-α treatment the active complex is found in the cell nucleus, thus
confirming these proteins as a possible specific link from an occupied receptor to a
limited set of genes [Levy et al., GENES & DEV., 3 (1989)].

In accordance with the present invention, specific proteins comprising receptor
recognition factors have been isolated and sequenced. These proteins, their
fragments, antibodies and other constructs and uses thereof, are contemplated and
presented herein. To understand the mechanism of cytoplasmic activation of the
ISGF-3α proteins as well as their transport to the nucleus and interaction with
ISGF-3γ, the factor has been purified in sufficient quantity to obtain peptide
sequence from each protein. Degenerate deoxyoligonucleotides that would encode
the peptides were constructed and used in a combination of cDNA library
screening and PCR amplification of cDNA products copied from mRNA to
identify cDNA clones encoding each of the four proteins. What follows in the
examples presented herein is a description of the final protein preparations that
allowed the cloning of cDNAs encoding all the proteins, and the primary sequence
of the 113 kD protein arising from a first gene, and the primary sequences of the
91 and 84 kD proteins which appear to arise from two differently processed RNA
products from another gene. Antisera against portions of the 84 and 91 kD
proteins have also been prepared and bind specifically to the ISGF-3 DNA binding
factor (detected by the electrophoretic mobility shift assay with cell extracts),
indicating that these cloned proteins are indeed part of ISGF-3. The availability of
the cDNAs and the proteins they encode provides the necessary material to
understand how the liganded IFN-α receptor causes immediate cytoplasmic
activation of the ISGF-3 protein complex, as well as to understand the mechanisms
of action of the receptor recognition factors contemplated herein. The cloning of
each of the ISGF-3α proteins, and the evaluation and confirmation of the particular
role played by the 91 kD protein as a messenger and DNA binding protein in
response to IFN-γ activation, including the development and testing of antibodies
to the receptor recognition factors of the present invention, are all presented in the
examples that follow below.

EXAMPLE 1

To purify relatively large amounts of ISGF-3, HeLa cell nuclear extracts were
prepared from cells treated overnight (16-18 h) with 0.5 ng/ml of IFN-γ and 45
min. with IFN-α (500 U/ml). The steps used in the large scale purification were
modified slightly from those described earlier in the identification of the four
ISGF-3 proteins.

Accordingly, nuclear extracts were made from superinduced HeLa cells [Levy et
al., THE EMBO J., 9 (1990)] and chromatographed as previously described [Fu
et al., PROC. NATL. ACAD. SCI. USA, 87 (1990)] on: phosphocellulose P-11;
heparin agarose (Sigma); DNA cellulose (Boehringer Mannheim; flow through was
collected after the material was adjusted to 0.28 M KCl and 0.5% NP-40); two
successive rounds of ISRE oligo affinity column (1.8 ml column, eluted with a
linear gradient of 0.05 to 1.0 M KCl); a point mutant ISRE oligonucleotide affinity
column (flow through was collected after the material was adjusted to 0.28 M
KCl); and a final round on the ISRE oligonucleotide column (material was eluted
in a linear 0.05 to 1.0 M NaCl gradient adjusted to 0.05% NP-40). Column
fractions containing ISGF-3 were subsequently examined for purity by SDS
PAGE/silver staining and pooled appropriately. The pooled fractions were
concentrated by a Centricon-10 (Amicon). The pools of fractions from
preparations 1 and 2 were combined and run on a 10 cm wide, 1.5 mm thick
7.5% SDS polyacrylamide gel. The proteins were electroblotted to nitrocellulose
for 12 hrs at 20 volts in 12.5% MeOH, 25 mM Tris, 190 mM glycine. The
membrane was stained with ~0.1% Ponceau Red (in 1% acetic acid) and the bands
of 113 kD, 91 kD, 84 kD, and 48 kD excised and subjected to peptide analysis
after tryptic digestion [Wedrychowski et al., J. BIOL. CHEM., 265 (1990);
Aebersold et al., PROC. NATL. ACAD. SCI. USA, 84 (1987)]. The resulting
peptide sequences for the 91 kD and 84 kD proteins are indicated in Fig. 6.
Degenerate oligonucleotides were designed based on the peptide sequences t19,
t13b and t27 (forward and reverse complements are denoted by F and R):

19F   AACGTIGACCAATTNAACATG (SEQ ID NO: 14)
      T T GC T
      T

13bR  GTCGATGTTNGGGTANAG (SEQ ID NO: 15)
      A A A A A

27R   GTACAAITCAACCAGNGCAA (SEQ ID NO: 16)
      T TG T T

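
For illustration, the degeneracy of such an oligonucleotide pool, i.e. the number of distinct sequences it contains, can be counted from its ambiguity codes. The sketch below is an assumption-laden example and not part of the original procedure: it uses the standard IUPAC nucleotide codes, treats inosine (I) as a single species, and counts only the in-line N positions of the sequences given above. The alternative bases printed beneath each sequence are not modelled because their column positions are not stated in this text, so the resulting counts are lower bounds.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes. "I" (inosine) is treated as a
# single species because it is one synthesized base, not a mixture.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "I": "I",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(oligo):
    """Number of distinct sequences represented by a degenerate oligo."""
    count = 1
    for base in oligo.upper():
        count *= len(IUPAC[base])
    return count


def expand(oligo):
    """List every concrete sequence in the degenerate pool."""
    choices = [IUPAC[base] for base in oligo.upper()]
    return ["".join(combo) for combo in product(*choices)]


OLIGO_19F = "AACGTIGACCAATTNAACATG"   # one N position
OLIGO_13BR = "GTCGATGTTNGGGTANAG"     # two N positions
OLIGO_27R = "GTACAAITCAACCAGNGCAA"    # one N position
```

Under these assumptions `degeneracy(OLIGO_13BR)` counts 16 species from its two N positions, and `expand` lists them explicitly.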

The final ISRE oligonucleotide affinity selection yielded material with the SDS
polyacrylamide gel electrophoretic pattern shown in Fig. 4 (left). This gel
represented about 1.5% of the available material purified from over 200 L of
appropriately treated HeLa cells. While 113, 91, 84 and 48 kD bands were
clearly prominent in the final purified preparation (see Fig. 4, right panel), there
were also two prominent contaminants of about 86 and 70 kD and a few other
contaminants in lower amounts. [Amino acid sequence data have shown that the
contaminants of 86 kD and 70 kD are the KU antigen, a widely-distributed protein
that binds DNA termini. However, in the specific ISGF-3:ISRE complex there is
no KU antigen and therefore it has been assigned no role in IFN-dependent
transcriptional stimulation [Wedrychowski et al., J. BIOL. CHEM., 265 (1990)]].

Since the mobility of the 113, 91, 84, and 48 kD proteins could be accurately
marked by comparison with the partially purified proteins characterized in
previous experiments [Fu et al., PROC. NATL. ACAD. SCI. USA, 87 (1990)],
further purification was not attempted at this stage. The total purified sample
from 200 L of HeLa cells was loaded onto one gel, subjected to electrophoresis,
transferred to nitrocellulose and stained with Ponceau red. The 113, 84, 91, and
48 kD protein bands were separately excised and subjected to peptide analysis as
described [Aebersold et al., PROC. NATL. ACAD. SCI. USA, 84 (1987)].
Released peptides were collected, separated by HPLC and analyzed for sequence
content by automated Edman degradation analysis.

Accordingly, the use of the peptide sequence data for three of four peptides from
the 91 kD protein and a single peptide derived from the 84 kD protein is described
herein. The peptide sequences and the oligonucleotides constructed from them are
given in the legend to Fig. 4 or 6. When oligonucleotides 19F and 13bR were
used to prime synthesis from a HeLa cell cDNA library, a PCR product of 475 bp
was generated. When this product was cloned and sequenced it encoded the 13a
peptide internally. Oligonucleotide 27R derived from the only available 84 kD
peptide sequence was used in an anchored PCR procedure amplifying a 405 bp
segment of DNA. This 405 bp amplified sequence was identical to an already
sequenced region of the 91 kD protein. It was then realized that the peptide t27
sequence was contained within peptide t19 and that the 91 and 84 kD proteins
must be related (see Fig. 5 & 7). Oligonucleotides 19F and 13a were also used to
select candidate cDNA clones from a cDNA library made from mRNA prepared
after 16 hr. of IFN-γ and 45 min. of IFN-α treatment.

Of the numerous cDNA clones that hybridized these oligonucleotides and also the
cloned PCR products, one cDNA clone, E4, contained the largest open reading
frame flanked by in-frame stop codons. Sequences of peptides t19, t13a, and t13b
were contained in this 2217 bp ORF (see Fig. 6) which was sufficient to encode a
protein of 739 amino acids (calculated molecular weight of 86 kD). The codon for
the indicated initial methionine was preceded by three in-frame stop codons. This
coding capacity has been confirmed by translating in vitro an RNA copy of the E4
clone yielding product of nominal size of 86 kD, somewhat shorter than the in
vitro purified 91 kD protein discussed earlier (data not shown). Perhaps this result
indicates post-translational modification of the protein in the cell.

A second class of clones was also identified (see Fig. 5). E3, the prototype of this
class, was identical to E4 from the 5' end to bp 2286 (aa 701) at which point the
sequences diverged completely. Both cDNAs terminated with a poly(A) tail.
Primer extension analysis suggested another ~150 bp were missing from the 5'
end of both mRNAs. DNA probes were made from the clones representing both
common and unique sequences for use in Northern blot analyses. The preparation
of the probes is as follows: 20 µg of cytoplasmic RNA (0.5% NP-40 lysate) of
IFN-α treated (6 h) HeLa cells was fractionated in a 1% agarose, 6%
formaldehyde gel (in 20 mM MOPS, 5 mM NaAc, 1 mM EDTA, pH 7.0) for 4.5
h at 125 volts. The RNA was transferred in 20 x SSC to Hybond-N (Amersham),
UV crosslinked and hybridized with 1 x 10^6 cpm/ml of the indicated probes
(1.5x10^ cpm/mg). 

Probes from regions common to E3 and E4 hybridized to two RNA species of
approximately 3.1 KB and 4.4 KB. Several probes derived from the 3'
non-coding end of E4, which were unique to E4, hybridized only the larger RNA
species. A labeled DNA probe from the unique 3' non-coding end of E3
hybridized only the smaller RNA species.

Review of the sequence at the site of 3' discontinuity between E3 and E4
suggested that the shorter mRNA results from choice of a different poly(A) site
and 3' exon that begins at bp 2286 (the calculated molecular weight from the E3
ORF is 82 kD). The last two nucleotides before the change are GT followed by GT
in E3, in line with the consensus nucleotides at an exon-intron junction. Since the
ORF of E4 extends to bp 2401 it encodes a protein that is 38 amino acids longer
than the one encoded by E3, but is otherwise identical.
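
The coordinates quoted for E3 and E4 can be cross-checked with simple codon arithmetic. The sketch below is illustrative only and rests on assumptions not stated in the text: that the 2217 bp ORF excludes the stop codon, that bp 2401 is the last base of the final sense codon, and that E3 terminates at or immediately after the point of divergence.

```python
ORF_LENGTH_BP = 2217   # stated length of the E4 open reading frame
ORF_END_BP = 2401      # stated position to which the E4 ORF extends
DIVERGENCE_BP = 2286   # stated position at which E3 and E4 diverge


def orf_start(orf_end, orf_length):
    """First base of the ORF, given its last base and its length."""
    return orf_end - orf_length + 1


def codon_index(position, start):
    """1-based index of the codon that contains a nucleotide position."""
    return (position - start) // 3 + 1


start_bp = orf_start(ORF_END_BP, ORF_LENGTH_BP)               # 185
total_residues = ORF_LENGTH_BP // 3                           # 739
divergence_residue = codon_index(DIVERGENCE_BP, start_bp)     # 701
residues_after_divergence = total_residues - divergence_residue  # 38
```

Under these assumptions the figures agree with one another: a 2217 bp ORF encodes 739 residues, bp 2286 falls in codon 701, and the 38 residues beyond that point match the stated difference in length between the E4 and E3 products.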

Since there is no direct assay for the activity of the 91 or 84 kD protein, an
independent method was needed to determine whether the cDNA clones we had
isolated did indeed encode proteins that are part of ISGF-3. For this purpose
antibodies were initially raised against the sequence from amino acid 597 to amino
acid 703 (see Fig. 6) by expressing this peptide in the pGEX-3X vector (15) as a
bacterial fusion protein. This antiserum (a42) specifically recognized the 91 kD
and 84 kD proteins in both crude extracts and purified ISGF-3 (see Fig. 7a).
More importantly, this antiserum specifically affected the ISGF-3 band in a
mobility shift assay using the labeled ISRE oligonucleotide (see Fig. 7b),
confirming that the isolated 91 kD and 84 kD cDNA clones (E4 and E3) represent
a component of ISGF-3. Additional antisera were raised against the amino
terminus and carboxy terminus of the protein encoded by E4. The amino terminal
59 amino acids that are common to both proteins and the unique carboxy terminal
34 amino acids encoded only by the larger mRNA were expressed as fusion
proteins in pGEX-3X for immunization of rabbits. Western blot analysis with
highly purified ISGF-3 demonstrated that the amino terminal antibody (a55)
recognized both the 91 kD and 84 kD proteins as expected. However, the other
antibody (a57) recognized only the 91 kD protein, confirming our assumption that
the larger mRNA (4.4 KB) and larger cDNA encode the 91 kD protein while the
shorter mRNA (3.1 KB) and cDNA encode the 84 kD protein (see Fig. 7a).

EXAMPLE 2

In this example, the cloning of the 113 kD protein that comprises one of the three
ISGF-3α components is disclosed.

From SDS gels of highly purified ISGF-3, the 113 kD band was identified,
excised and subjected to cleavage and peptide sequence analysis [Aebersold et al.,
PROC. NATL. ACAD. SCI. USA, 84 (1987)]. Five peptide sequences (A-E) were
obtained (Fig. 8A). Degenerate oligonucleotide probes were designed according to
these peptides, which then were radiolabeled to search a human cDNA library for
clones that might encode the 113 kD protein. Eighteen positive cDNA clones
were recovered from 2.5 x 10^ phage plaques with the probe derived from peptide
E (Fig. 8A, and the legend). Two of them were completely sequenced. Clone f11
contained a 3.2 KB cDNA, and clone ka31 a 2.6 KB cDNA that overlapped about
2 KB but which had a further extended 5' end in which a candidate AUG initiation
codon was found associated with a well-conserved Kozak sequence [Kozak,
NUCLEIC ACIDS RES., 12 (1984)].

In addition to the phage cDNA clones, a PCR product made between
oligonucleotides that encoded peptides D and E also yielded a 474 NT fragment
that when sequenced was identical with the cDNA clone in this region. A
combination of these clones f11 and ka31 revealed an open reading frame capable
of encoding a polypeptide of 851 amino acids (Fig. 8A). These two clones were
joined within their overlapping region and RNA transcribed from this recombinant
clone was translated in vitro yielding a polypeptide that migrated in an SDS gel
with a nominal molecular weight of 105 kD (Fig. 9A). An appropriate clone
encoding the 91 kD protein was also transcribed and the RNA translated in the
same experiment. Since both the apparently complete cDNA clones for the 113
kD protein and the 91 kD protein produce RNAs that when translated into proteins
migrate somewhat faster than the proteins purified as ISGF-3 components, it is
possible that the proteins undergo post-translational modification in the cell causing
them to be slightly retarded during electrophoresis. When a 660 bp cDNA
encoding the most 3' end of the 113 kD protein was used in a Northern analysis, a
single 4.8 KB mRNA species was observed (Figure 9B).

No independent assay is known for the activity of the 113 kD protein (or indeed
any of the ISGF-3α proteins), but it is known that the protein is part of a DNA
binding complex that can be detected by an electrophoretic mobility shift assay
[Fu et al., PROC. NATL. ACAD. SCI. USA, 87 (1990)]. Antibodies to DNA binding
proteins are known to affect the formation or migration of such complexes.
Therefore antiserum to a polypeptide segment (amino acid residues 323 to 527)
fused with bacterial glutathione S-transferase [Smith et al., PROC. NATL. ACAD.
SCI. USA, 83 (1986)] was raised in rabbits to determine the reactivity of the
ISGF-3 proteins with the antibody. A Western blot analysis showed that the
antiserum reacted predominantly with a 113 kD protein both in the ISGF-3 fraction
purified by specific DNA affinity chromatography (Lane 1) and in crude cell
extract (Lane 2, Fig. 10A). The weak reactivity to lower protein bands was
possibly due to 113 kD protein degradation. Most importantly, the antiserum
specifically removed almost all of the gel-shift complex leaving some of the
oligonucleotide probe in "shifted-shift" complexes which were specifically
competed away with a 50 fold molar excess of the oligonucleotide binding site (the
ISRE, ref. 2) for ISGF-3 (Fig. 10B). Notably, this antiserum had no effect on the
faster migrating shift band produced by the ISGF-3γ component alone (Figure 10B).
Thus it appeared that the antiserum to the 113 kD fusion product does indeed react
with another protein that is part of the complete ISGF-3 complex.

A detailed sequence comparison between the 113 and 91 sequences followed (Fig.
8B): while the nucleotide sequence showed only a distant relationship between the
two proteins, there were long stretches of amino acid identity. These conserved
regions were scattered throughout almost the entire 715 amino acid length encoded
by the 91/84 clone. It was particularly striking that the regions corresponding to
amino acids 1 to 48 and 317 to 353 and 654 to 678 in the 113 sequence were 60%
to 70% identical to corresponding regions of the 91 kD sequence. Thus the genes
encoding the 113 and 84/91 proteins are closely related but not identical.

Through examination for possible consensus sequences that might reveal
sub-domain structures in the 113 kD or 84/91 kD sequence, it was found that both
proteins contained regions whose sequence might form a coil structure with heptad
leucine repeats. This occurred between amino acid 210 and 245 in the 113 kD
protein and between 209 and 237 in the 84/91 protein. In both the 113 kD and the
91/84 kD sequences, 4 out of 5 possible heptad repeats were leucine and one was
valine. Domains of this type might provide a protein surface that encourages
homo- or heterotypic protein interactions which have been observed in several other
transcription factors [Vinson et al., SCIENCE, 246 (1989)]. An extended acidic
domain was located at the carboxyl terminal of the 113 kD protein but not in the 91
kD protein (Fig. 8A), possibly implicating the 113 kD protein in gene activation
[Hope et al., Ma et al., CELL, 48 (1987)].
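
A search for heptad leucine repeats of the kind described above can be sketched as a scan for leucine (or valine) recurring at every seventh residue. This is a minimal illustration and not the analysis actually performed: the function names, the threshold of four repeats and the toy sequence are hypothetical, and the toy sequence is not a sequence from the present disclosure.

```python
def heptad_positions(start, end):
    """1-based residue positions spaced seven apart, from start through end."""
    return list(range(start, end + 1, 7))


def find_heptad_repeats(seq, allowed="LV", min_repeats=4):
    """Find runs of allowed residues recurring at a spacing of seven.

    Returns a list of (start_position, repeat_count) tuples, with positions
    counted from 1, for every maximal run of at least min_repeats residues.
    """
    hits = []
    n = len(seq)
    for i in range(n):
        if seq[i] not in allowed:
            continue
        if i >= 7 and seq[i - 7] in allowed:
            continue  # already counted as part of a run that starts earlier
        count = 1
        j = i + 7
        while j < n and seq[j] in allowed:
            count += 1
            j += 7
        if count >= min_repeats:
            hits.append((i + 1, count))
    return hits


# Hypothetical 40-residue toy sequence with L, L, L, V, L at positions
# 3, 10, 17, 24 and 31, mimicking four leucines and one valine in a repeat.
toy = ["A"] * 40
for pos, residue in [(3, "L"), (10, "L"), (17, "L"), (24, "V"), (31, "L")]:
    toy[pos - 1] = residue
toy_sequence = "".join(toy)
repeats = find_heptad_repeats(toy_sequence)
```

For the interval stated for the 84/91 protein, `heptad_positions(209, 237)` gives five positions (209, 216, 223, 230 and 237), which is consistent with five possible heptad repeats in that region.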

DISCUSSION 

When compared at moderate or high stringency to the Genbank and EMBL data 
bases, there were no sequences like 113 or the 84/91 sequence. Preliminary PCR 

20 experiments, however, indicate that there are other family members with different
sequences recoverable from a human cell cDNA library (Qureshi and Darnell,
unpublished). Thus, it appears that the 113 and 84/91 sequences may represent
the first two members to be cloned of a larger family of proteins. We would 
hypothesize that the 113 kD and 84/91 kD proteins may act as signal transducers, 

25 somehow interacting with the internal domain of a liganded IFNα receptor or its
associated protein, and further that a family of waiting cytoplasmic proteins exists
whose purpose is to be specific signal transducers when different receptors are
occupied. Many experiments lie ahead before this general hypothesis can be
critically tested. Recent experiments have indicated that inhibitors of protein

30 kinases can prevent ISGF-3 complex formation [Reich et al., PROC. NATL.
ACAD. SCI. USA, 87 (1990); Kessler et al., J. BIOL. CHEM., 266 (1991)].
However, neither the IFNα nor the IFNγ receptor that has so far been cloned has
intrinsic kinase activity [Uze et al., CELL, 60 (1990); Aguet et al., CELL, 55 
(1988)]. We would speculate that either a second receptor chain with kinase 
activity or a separate kinase bound to a liganded receptor could be a part of a 
5 complex that would convey signals to the ISGF-3α proteins at the inner surface of
the plasma membrane. 

From the above, it has been concluded that accurate peptide sequences from
ISGF-3 protein components have been determined, leading to correct identification

10 of cDNA clones encoding the 113, 91 and 84 kD components of ISGF-3. Since
staurosporine, a broadly effective kinase inhibitor, blocks IFN-α induction of
transcription and of ISGF-3 formation [Reich et al., PROC. NATL. ACAD. SCI.
USA, 87 (1990); Kessler et al., J. BIOL. CHEM., 266 (1991)], it seems possible
that the ISGF-3α proteins are direct cytoplasmic substrates of a liganded

15 receptor-associated kinase. The antiserum against these proteins should prove
invaluable in identifying the state of the ISGF-3α proteins before and after IFN
treatment and will allow the direct exploration of the biochemistry of signal 
transduction from the IFN receptor. 

20 EXAMPLE 3 

As mentioned earlier, the observation and conclusion underlying the present 
invention were crystallized from a consideration of the results of certain 
investigations with particular stimuli. Particularly, the present disclosure is 
25 illustrated by the results of work on protein factors that govern transcriptional

control of IFNα-stimulated genes, as well as more recent data on the regulation of
transcription of genes stimulated by IFNγ.

For example, there is evidence that the 91 kD protein is the tyrosine kinase target 
30 when IFNγ is the ligand. Thus two different ligands acting through two different
receptors both use these family members. With only a modest number of family 
members and combinatorial use in response to different ligands, this family of
proteins becomes an even more likely possibility to represent a general link 
between ligand-occupied receptors and transcriptional control of specific genes in 
the nucleus. 

5 

Further study of the 113, 91 and 84 kD proteins of the present invention has
revealed that they are phosphorylated in response to treatment of cells with IFNα
(Figure 11). Moreover, when the phosphoamino acid is determined in the newly
phosphorylated protein, the amino acid has been found to be tyrosine (Fig. 12).
10 This phosphorylation has been observed to disappear after several hours, indicating
action of a phosphatase on the 113, 91 and 84 kD proteins to stop transcription.
These results show that IFN-dependent transcription very likely demands this
particular phosphorylation and that a cycle of interferon-dependent phosphorylation-
dephosphorylation is responsible for controlling transcription.

15 

It is proposed that other members of the 113-91 protein family will be identified as
phosphorylation targets in response to other ligands. If, as is believed, the tyrosine
phosphorylation site on proteins in this family is conserved, one can then easily
determine which family members are activated (phosphorylated), and likewise the
20 particular extracellular polypeptide ligand to which that family member is
responding. The modifications of these proteins (phosphorylation and
dephosphorylation) enable the preparation and use of assays for determining the
effectiveness of pharmaceuticals in potentiating or preventing intracellular
responses to various polypeptides, and such assays are accordingly contemplated
25 within the scope of the present invention. 

EXAMPLE 4 
Identification of murine 91 kD protein

A fragment of the gene encoding the human 91 kD protein was used to screen a 
murine thymus and spleen cDNA library for homologous proteins. The screening 
assay yielded a highly homologous gene encoding a murine polypeptide that is
greater than 95% homologous to the human 91 kD protein. The nucleic acid and
deduced amino acid sequence of the murine 91 kD protein are shown in Figure 
12A-12C, and SEQ ID NO:7 (nucleotide sequence) and SEQ ID NO: 8 (amino acid 
5 sequence). 

EXAMPLE 5 

Additional Members of The 113-91 Protein Family 

10 Using a 300 nucleotide fragment amplified by PCR from the SH2 region of the

murine 91kD protein gene, murine genes encoding two additional members of the 
113-91 family of receptor recognition factor proteins were isolated from a murine 
splenic/thymic cDNA library according to the method of Sambrook et al. (1989, 
Molecular Cloning, A Laboratory Manual, 2nd. ed., Cold Spring Harbor Press: 

15 Cold Spring Harbor, New York) constructed in the ZAP vector. Hybridization 
was carried out at 42°C and the filters were washed at 42°C before the first exposure (Church and
Gilbert, 1984, Proc. Natl. Acad. Sci. USA 81:1991-95). Then the filters were
washed in 2X SSC, 0.1% SDS at 65°C for a second exposure. Stat1 clones
survived the 65°C washing, whereas Stat3 and Stat4 clones were identified as

20 plaques that lost signals at 65°C. The plaques were purified and subcloned
according to Stratagene commercial protocols. 

This probe was chosen to screen for other STAT family members because, while 
Stat1 and Stat2 SH2 domains are quite similar over the entire 100 to 120 amino
25 acid region, only the amino terminal half of the STAT SH2 domains strongly
resembles the SH2 regions found in other proteins.

The two genes have been cloned into plasmids 13sf1 and 19sf6. The nucleotide
sequence, and deduced amino acid sequence, for the 13sf1 and 19sf6 genes are
30 shown in Figures 14 and 15, respectively. These proteins are alternatively termed 
Stat4 and Stat3, respectively. 
Comparison with the sequence of Stat91 (Stat1) and Stat113 (Stat2) shows several
highly conserved regions, including the putative SH3 and SH2 domains. The
conserved amino acid stretches likely point to conserved domains that enable these
proteins to carry out transcription activation functions. Stat3, like Stat1 (Stat91),
5 is widely expressed, while Stat4 expression is limited to the testes, thymus, and
spleen. Stat3 has been found to be activated as a DNA binding protein through
phosphorylation on tyrosine in cells treated with EGF or IL-6, but not after IFN-γ
treatment.

10 Both the 13sf1 and 19sf6 genes share significant homology with the genes

encoding the human and murine 91 kD protein. There is corresponding homology
between the deduced amino acid sequences of the 13sf1 and 19sf6 proteins and the
amino acid sequences of the human and murine 91 kD proteins, although not the
greater than 95% amino acid homology that is found between the murine and

15 human 91 kD proteins. Thus, though clearly of the same family as the 91 kD
protein, the 13sf1 and 19sf6 genes encode distinct proteins.

The chromosomal locations of the murine STAT proteins (1-4) have been 
determined: Stat1 and Stat4 are located in the centromeric region of mouse
20 chromosome 1 (corresponding to human 2q 32-34q); the two other genes are on 
other chromosomes. 

Southern analysis using probes derived from 13sf1 and 19sf6 on human genomic
libraries has established that genes corresponding to the murine 13sf1 and 19sf6
25 genes are found in humans. 


Tissue distribution of mRNA expression of these genes was evaluated by Northern 
hybridization analysis. The results of this distribution analysis are shown in the 
following Table. 

30 

TABLE
DISTRIBUTION OF mRNA EXPRESSION OF 13sf1, 19sf6, AND 91 kD PROTEINS

ORGAN           13sf1        19sf6        91 kD
BRAIN           -            +            -
HEART           -            +++          -
KIDNEY          -            -            -
LIVER           -            +            +
LUNG            -            -            -
SPLEEN          +            +            ++++
TESTIS          ++++         ++           N.A.
THYMUS          ++           ++           +++
EMBRYO (16d)    not found    found        found

Northern analysis demonstrates that there is variation in the tissue distribution of 
15 expression of the mRNAs encoded by these genes. The variation and tissue

distribution indicate that the specific genes encode proteins that are responsive to
different factors, as would be expected in accordance with the present invention. 
The actual ligand, the binding of which induces phosphorylation of the newly 
discovered factors, will be readily determinable based on the tissue distribution 
20 evidence described above. 


To determine whether the Stat3 and Stat4 proteins were present in cells, protein 
blots were carried out with antisera against each protein. The antisera were 
obtained by subcloning amino acids 688 to 727 of Stat3 and 678 to 743 of Stat4 to 
25 pGEX1λt (Pharmacia) by PCR with oligonucleotides based on the boundary
sequence plus restriction sites (BamHI at the 5' end and EcoRI at the 3' end),
allowing for in-frame fusion with GST. One milligram of each antigen was used
for the immunization and three booster injections were given 4 weeks apart. Anti-
Stat3 and anti-Stat4 sera were used 1:1000 in Western blots using standard
protocols. To avoid cross reactivity of the antisera, antibodies were raised against
the C-terminus of Stat3 and Stat4, the less homologous region of the protein.

5 

These proteins were unambiguously found in several tissues where the mRNA was
known to be present. Protein expression was checked in several cell lines as well. 
A protein of 89 kD reactive with Stat4 antiserum was expressed in 70Z cells, a 
preB cell line, but not in many other cell lines. Stat3 was highly expressed, 
10 predominantly as a 97 kD protein, in 70Z, HT2 (a mouse helper T cell clone), and 
U937 (a macrophage-derived cell). 

To prove that the full length functional cDNA clones of Stat3 and Stat4 were
obtained, the open reading frame of each cDNA was independently (i.e.,

15 separately) cloned into the Rc/CMV expression vector (Invitrogen) downstream of
a CMV promoter. The resulting plasmids were transfected into COS1 cells and
proteins were extracted 60 hrs post-transfection and examined by Western blot
after electrophoresis. Untransfected COS1 cells expressed a low level of 97 kD
Stat3 protein but did not express a detectable level of Stat4. Upon transfection of

20 the Stat3-expressing plasmid, the 97 kD Stat3 was increased at least 10-fold. An
89 kD protein antigenically related to Stat3, found as a minor band in most cell
line extracts, was also increased post-transfection. This protein therefore appears
to represent another form of Stat3 protein, or an antigenically similar protein
whose synthesis is stimulated by Stat3. Transfection with Stat4 led to the

25 expression of an 89 kD reactive band indistinguishable in size from the p89 Stat4
found in 70Z cell extracts.

DISCUSSION 

30 As mentioned earlier, the observation and conclusion underlying the present 
invention were crystallized from a consideration of the results of certain 
investigations with particular stimuli. Particularly, the present disclosure is
illustrated by the results of work on protein factors that govern transcriptional
control of IFNα-stimulated genes, as well as more recent data on the regulation of
transcription of genes stimulated by IFNγ. The present disclosure is further
5 illustrated by the identification of related genes encoding protein factors responsive
to as yet unknown factors. It is expected that the murine 91 kD protein is
responsive to IFN-γ.

For example, the above represents evidence that the 91 kD protein is the tyrosine 
10 kinase target when IFNγ is the ligand. Thus two different ligands acting through
two different receptors both use these family members. With only a modest
number of family members and combinatorial use in response to different ligands, 
this family of proteins becomes an even more likely possibility to represent a 
general link between ligand-occupied receptors and transcriptional control of 
15 specific genes in the nucleus. 

It is proposed and shown by the foregoing that other members of the 113-91 
protein family will be and have been identified as phosphorylation targets in 
response to other ligands. If as is believed, the tyrosine phosphorylation site on 

20 proteins in this family is conserved, one can then easily determine which family 
members are activated (phosphorylated), and likewise the particular extracellular 
polypeptide ligand to which that family member is responding. The modifications 
of these proteins (phosphorylation and dephosphorylation) enable the preparation
and use of assays for determining the effectiveness of pharmaceuticals in 

25 potentiating or preventing intracellular responses to various polypeptides, and such 
assays are accordingly contemplated within the scope of the present invention. 

Earlier work has concluded that a DNA binding protein was activated in the cell
cytoplasm in response to IFN-γ treatment and that this protein stimulated
30 transcription of the GBP gene (10,14). In the present work, with the aid of
antisera to proteins originally studied in connection with IFN-α gene stimulation
(7,12,15), the 91 kD ISGF-3 protein has been assigned a prominent role in IFN-γ
gene stimulation as well. The evidence for this conclusion included: 1) antisera
specific to the 91 kD protein affected the IFN-γ dependent gel-shift complex;
2) a 91 kD protein could be cross-linked to the GAS IFN-γ activated site; 3) a
5 35S-labeled 91 kD protein and a 91 kD immunoreactive protein specifically purified
with the gel-shift complex; 4) the 91 kD protein is an IFN-γ dependent tyrosine
kinase substrate, as indeed it had earlier proved to be in response to IFN-α (15); and
5) the 91 kD protein but not the 113 kD protein moved to the nucleus in response
to IFN-γ treatment. These experiments do not prove, but do strongly suggest, that
10 the same 91 kD protein acts differently in different DNA binding complexes that
are triggered by either IFN-α or IFN-γ.



These results strongly support the hypothesis, originated from studies on IFN-α, that
polypeptide cell surface receptors report their occupation by extracellular ligand to

15 latent cytoplasmic proteins that after activation move to the nucleus to trigger
transcription (4,15,21). Furthermore, because cytoplasmic phosphorylation and
factor activation are so rapid, it appears likely that the functional receptor complexes
contain tyrosine kinase activity. Since the IFN-γ receptor chain that has been
cloned thus far (22) has no hint of possessing intrinsic kinase activity, perhaps

20 some other molecule with tyrosine kinase activity couples with the IFN-γ receptor.
Two recent results with other receptors suggest possible parallels to the situation
with the IFN receptors. The trk protein, which has an intracellular tyrosine kinase
domain, associates with the NGF receptor when that receptor is occupied (23). In
addition, the lck protein, a member of the src family of tyrosine kinases, is

25 co-precipitated with the T cell receptor (24). It is possible to predict that signal
transduction to the nucleus through these two receptors could involve latent
cytoplasmic substrates that form part of activated transcription factors. In any
event, it seems possible that there are kinases like trk or lck associated with the
IFN-γ receptor or with the IFN-α receptor.
With regard to the effect of phosphorylation on the 91 kD protein, it was
something of a surprise that after IFN-γ treatment the 91 kD protein becomes a
DNA binding protein. Its role must be different in response to IFN-α treatment.
There, the protein is also phosphorylated on tyrosine and joins a complex with the 113 and
5 84 kD proteins but, as judged by UV cross-linking studies (7), the 91 kD protein
does not contact DNA.

In addition to becoming a DNA binding protein, it is clear that the 91 kD protein is
specifically translocated to the nucleus in the wake of IFN-γ stimulation.

10 

EXAMPLE : DIMERIZATION OF PHOSPHORYLATED STAT91 

Stat91 (a 91 kD protein that acts as a signal transducer and activator of
transcription) is inactive in the cytoplasm of untreated cells but is activated by

15 phosphorylation on tyrosine in response to a number of polypeptide ligands
including IFN-α and IFN-γ. This example reports that inactive Stat91 in the
cytoplasm of untreated cells is a monomer and that upon IFN-γ induced
phosphorylation it forms a stable homodimer. The dimer is capable of binding to 
a specific DNA sequence directing transcription. Dissociation and reassociation 

20 assays show that dimerization of Stat91 is mediated through SH2-phosphotyrosyl 
peptide interactions. Dimerization involving SH2 recognition of specific 
phosphotyrosyl peptides may well provide a prototype for interactions among 
family members of STAT proteins to form different transcription complexes and 
Jak2 for the IFN-γ pathway (42, 43, 44). These kinases themselves become

25 tyrosine phosphorylated to carry out specific signaling events. 

Materials and Methods 

Cell Culture. Human 2fTGH and U3A cells were maintained in DMEM medium
30 supplied with 10% bovine calf serum. U3A cell lines supplemented with various 
Stat91 protein constructs were maintained in 0.1 mg/ml G418 (Gibco, BRL). 
Stable cell lines were selected as described (45). IFN-γ (5 ng/ml, gift from
Amgen) treatment of cells was for 15 min. unless otherwise noted. 

Plasmid Constructions. Expression construct MNC-84 was made by insertion of
5 the cDNA into the Not I-Bam HI cloning site of an expression vector pMNC (45,
35). MNC-91L was made by insertion of the Stat91 cDNA into the Not I-Bam
HI cloning sites of pMNC without the stop codon at the end, resulting in the
production of a long form of Stat91 with a C-terminal tag of 34 amino acids
encoded by the pMNC vector.

10 

GST fusion protein expression plasmids were constructed using the pGEX-
2T vector (Pharmacia). GST-91SH2 encodes amino acids 573 to 672 of Stat91;
GST-91mSH2 encodes amino acids 573 to 672 of Stat91 with an Arg-602 -> Leu-
602 mutation; and GST-91SH3 encodes amino acids 506 to 564 of Stat91.

15 

DNA Transfection. DNA transfection was carried out by the calcium phosphate
method, and stable cell lines were selected in Dulbecco's modified Eagle's 
medium containing G418 (0.5 mg/ml, Gibco), as described (45). 

20 Preparation of Cell Extracts. Crude whole cell extracts were prepared as

described (31). Cytoplasmic and nuclear extracts were prepared essentially as 
described (46). 

Affinity Purification. Affinity purification with a biotinylated oligonucleotide was as
25 described (31). The sequence of the biotinylated GAS oligonucleotide was from 
the Ly6E gene promoter (34). 

Nondenaturing Polyacrylamide Gel Analysis. A nondenatured protein molecular
weight marker kit with a range of molecular weights from 14 to 545 kD was 
30 obtained from Sigma. Determining molecular weights using nondenaturing 

polyacrylamide gel was carried out following the manufacturer's procedure, which 
is a modification of the methods of Bryan and Davis (47, 48). Phosphorylated and
unphosphorylated Stat91 samples obtained from affinity purification using a 
biotinylated GAS oligonucleotide (31) were resuspended in a buffer containing 10 
mM Tris (pH 6.7), 16% glycerol, 0.04% bromphenol blue (BPB). The mixtures 
5 were analyzed on 4.5%, 5.5%, 6.5%, and 7.5% native gels side by side with
standard markers using a Bio-Rad mini-Protean II Cell electrophoresis system. 
Electrophoresis was stopped when the dye (BPB) reached the bottom of the gels. 
The molecular size markers were revealed by Coomassie blue staining. 
Phosphorylated and unphosphorylated Stat91 samples were detected by 
10 immunoblotting with anti-91T. 

Glycerol Gradient Analysis. Cell extracts (Bud8) were mixed with protein
standards (Pharmacia) and subjected to centrifugation through preformed 10%-
40% glycerol gradients for 40 hours at 40,000 rpm in an SW41 rotor as described
15 (6). 

Gel Mobility Shift Assays. Gel mobility shift assays were carried out as described 
(34). An oligonucleotide corresponding to the GAS element from the human 
FcγRI receptor gene (Pearse et al. 1993) was synthesized and used for gel
20 mobility shift assays. The oligonucleotide has the following sequence: 
5'GATCGAGATGTATTTCCCAGAAAAG3' (SEQ. ID NO: 17).
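
For convenience in designing the complementary strand of a double-stranded probe, the reverse complement of the oligonucleotide above can be derived as in the following sketch. The helper name is hypothetical, and the sketch assumes an unmodified DNA sequence containing only A, C, G and T.

```python
# Translation table mapping each base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    dna = dna.upper()
    if set(dna) - set("ACGT"):
        raise ValueError("sequence may contain only A, C, G and T")
    return dna.translate(_COMPLEMENT)[::-1]


# GAS oligonucleotide as given in SEQ. ID NO: 17 (top strand, 5' to 3').
gas_top = "GATCGAGATGTATTTCCCAGAAAAG"
gas_bottom = reverse_complement(gas_top)  # bottom strand, also 5' to 3'
```

Applying the function twice returns the original sequence, which serves as a quick sanity check on any transcription of the probe.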

Synthesis of Peptides. Solid phase peptide synthesis was performed either with a
DuPont RAMPS multiple synthesizer or by manual synthesis. C-terminal amino acids

25 attached to Wang resin were obtained from DuPont/NEN. All amino acids were
coupled as the N-Fmoc pentafluorophenyl esters (Advanced Chemtech), except for
N-Fmoc, PO-dimethyl-L-phosphotyrosine (Bachem). Double couplings were used.
Cleavage from resin and deprotection used thioanisole/m-cresol/TFA/TMSBr at
4°C for 16 hr. Purification used C-18 column HPLC with 0.1% TFA/acetonitrile

30 gradients. Peptides were characterized by 1H and 31P NMR, and by mass spectrometry,
and were greater than 95% pure.
Guanidium Hydrochloride Treatment. Extracts were incubated with guanidium
hydrochloride (final concentration was 0.4 to 0.6 M) for two min. at room
temperature and then diluted with gel shift buffer (final concentration of guanidium
hydrochloride was 100 mM) and incubated at room temperature for 15 min.
5 Labeled GAS oligonucleotide probe was then added directly to the mixture followed
by gel mobility shift assay. 
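
The dilution step above follows from the relation C1 x V1 = C2 x V2. The sketch below works through the arithmetic; the function names and the 2 microliter sample volume are hypothetical and serve only to illustrate the calculation.

```python
def fold_dilution(c_initial_molar: float, c_final_molar: float) -> float:
    """Fold dilution needed to go from the initial to the final concentration."""
    if c_initial_molar <= 0 or c_final_molar <= 0:
        raise ValueError("concentrations must be positive")
    if c_final_molar > c_initial_molar:
        raise ValueError("dilution cannot increase the concentration")
    return c_initial_molar / c_final_molar


def diluent_volume(sample_volume: float, c_initial_molar: float,
                   c_final_molar: float) -> float:
    """Volume of buffer to add (same units as sample_volume)."""
    return sample_volume * (fold_dilution(c_initial_molar, c_final_molar) - 1.0)


# 0.5 M extract brought to 100 mM: a 5-fold dilution.
fold = fold_dilution(0.5, 0.1)
# For an assumed 2 microliter sample, the gel shift buffer to add:
buffer_ul = diluent_volume(2.0, 0.5, 0.1)
```

Over the stated range of 0.4 to 0.6 M, the same relation gives a 4-fold to 6-fold dilution to reach 100 mM.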

Dissociation-reassociation Analysis. Extracts were incubated with various
concentrations of peptides or fusion proteins, and 32P-labeled GAS oligonucleotide
10 probe in gel shift buffer was then added to promote the formation of protein-
DNA complex followed by mobility shift analysis. This assay did not involve
guanidium hydrochloride treatment. 

Preparation of Fusion Proteins. Bacterially expressed GST fusion proteins were 
15 purified using standard techniques, as described in Birge et al., 1992. Fusion 
proteins were quantified by O.D. absorbance at 280 nm. Aliquots were frozen
at -70°C.

Results 

20 

Detection of Ligand Induced Dimer Formation of Stat91 in Solution. In untreated
cells, Stat91 is not phosphorylated on tyrosine. Treatment with IFN-γ leads
within minutes to tyrosine phosphorylation and activation of DNA binding 
capacity. The phosphorylated form migrates more slowly during electrophoresis 
25 under denaturing conditions affording a simple assay for the phosphoprotein (31). 

To determine the native molecular weights of the phosphorylated and 
unphosphorylated forms of Stat91, we separated them by affinity purification using 
a biotinylated deoxyoligonucleotide containing a GAS sequence (interferon gamma 
30 activation site) (Figure 16A). The separation of phosphorylated Stat91 from the
unphosphorylated form was efficient as almost all detectable phosphorylated form
could bind to the GAS site while unphosphorylated Stat91 remained unbound. To
determine the molecular weights of the purified phosphorylated Stat91 and
unphosphorylated Stat91, samples of each were then subjected to electrophoresis
through a set of nondenaturing gels containing various concentrations of
5 acrylamide followed by Western blot analysis (Figure 16B). Native protein size
markers (Sigma) were included in the analysis. 

This technique was originally described by Bryan (48) and was recently used for 
dimer analysis (49). The logic of the technique is that increasing gel
10 concentrations affect the migration of larger proteins more than smaller proteins, 
and the analysis is not affected by modifications such as protein phosphorylation 
(49). 

A function of the relative mobilities (Rm) was plotted versus the concentration of 
15 acrylamide for each sample to construct Ferguson plots (Figure 16C). The
logarithm of the retardation coefficient (calculated from Figure 16C) of each 
sample was then plotted against the logarithm of the relevant molecular weight 
range (Figure 16D). By extrapolation of its retardation coefficient (Figure 16D), 
the native molecular weight of Stat91 from untreated cells was estimated to be 
20 approximately 95 kD, while tyrosine phosphorylated Stat91 was estimated to be 
about twice as large, or approximately 180 kD. Because the calculated molecular
weight from amino acid sequence of Stat91 is 87 kD, and Stat91 migrates on
denaturing SDS gels with an apparent molecular weight of 91 kD (see supra, and
refs. 12 and 45), we concluded that in solution, unphosphorylated Stat91 exists as
25 a monomer while tyrosine phosphorylated Stat91 is a dimer.
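
The Ferguson analysis described above can be sketched numerically as two successive straight-line fits. The sketch assumes that log10 of the relative mobility is the plotted function (the marker kit procedure may use a scaled variant, which changes only the intercept) and that log10 of the retardation coefficient is linear in log10 of molecular weight across the standards. All function names are hypothetical, and no measured values from Figure 16 are reproduced.

```python
import math


def linear_fit(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def retardation_coefficient(gel_percents, rel_mobilities):
    """Negative slope of log10(Rm) versus gel concentration (Ferguson plot)."""
    slope, _ = linear_fit(gel_percents, [math.log10(rm) for rm in rel_mobilities])
    return -slope


def estimate_native_mw(standards, kr_unknown):
    """Read a molecular weight off the log(K_R) versus log(MW) standard curve.

    `standards` is a list of (molecular_weight_kD, retardation_coefficient).
    """
    slope, intercept = linear_fit(
        [math.log10(mw) for mw, _ in standards],
        [math.log10(kr) for _, kr in standards],
    )
    return 10 ** ((math.log10(kr_unknown) - intercept) / slope)
```

In use, a retardation coefficient would be computed for each standard and each Stat91 sample from its mobilities on the 4.5% to 7.5% gels, and the sample values would then be passed to `estimate_native_mw` together with the standards.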

We also employed glycerol gradient analysis to estimate the native molecular 
weights of both phosphorylated and unphosphorylated Stat91 (Figure 17). Whole 
cell extracts of fibroblast cells (Bud8) treated with IFN-γ were prepared and
30 subjected to sedimentation through a 10-40% glycerol gradient. Fractions from 
the gradient were collected and analyzed by both immunoblotting and gel mobility 
shift analysis (Figure 17A and 17B). As expected, two electrophoretic forms of
Stat91 could be detected by immunoblotting (Figure 17A): the slow-migrating 
form (tyrosine phosphorylated) and the fast-migrating form (unphosphorylated; 
Figure 17A). The phosphorylated Stat91 sedimented more rapidly than the 
5 unphosphorylated form. Again, using molecular weight markers, the native 

molecular weight of the unphosphorylated form of Stat91 appeared to be about 90 
kD while the tyrosine phosphorylated form of Stat91 was about 180 kD (Figure
17C), supporting the conclusion that unphosphorylated Stat91 existed as a 
monomer in solution while the tyrosine phosphorylated form exists as a dimer. 
10 When fractions from the glycerol gradients were analyzed by electrophoretic

mobility shift analysis (Figure 17B), the peak of the phosphorylated form of Stat91 
correlated well with the DNA-binding activity of Stat91. Thus only the
phosphorylated dimeric Stat91 has the sequence-specific DNA recognition 
capacity. 

15 

Stat91 Binds DNA as a Dimer. Long or short versions of a DNA binding protein
can produce, respectively, a slower or a faster migrating band during gel 
retardation assays. Finding intermediate gel shift bands produced by mixing two 
different sized species provides evidence of dimerization of the DNA binding 

20 proteins. Since Stat91 requires specific tyrosine phosphorylation in ligand-treated 
cells for its DNA binding, we sought evidence of formation of such heterodimers, 
first in transfected cells. An expression vector (MNC91L) encoding Stat91L, a
recombinant form of Stat91 containing an additional 34 amino acid carboxyl 
terminal tag was generated. [The extra amino acids were encoded by a segment of 

25 DNA sequence from plasmid pMNC (see Materials and Methods).] A Stat84 
expression vector (MNC84) was also available (45). From somatic cell genetic
experiments, mutant human cell lines (U3) are known that lack the Stat91/84
mRNA and proteins (29,30). The U3 cells were therefore separately transfected 
with vectors encoding Stat84 (MNC84) or Stat91L (MNC91L) or a mixture of 

30 both vectors. Permanent transfectants expressing Stat84 (C84), Stat91L (C91L) or 
both proteins (Cmx) were isolated (Figure 18A). 
Mobility shift analysis was performed with extracts from these stable cell lines
(Figure 18B). Extracts of IFN-γ-treated C84 cells produced a faster migrating gel
shift band than extracts of treated C91L cells. Most importantly, extracts from
IFN-γ-treated Cmx cells expressing both Stat84 and Stat91L proteins formed an
5 additional intermediate gel shift band. Anti-91T, an antiserum against the C-
terminal 38 amino acids of Stat91 (12) that are absent in Stat84, specifically
removed the top two shift bands seen with the Cmx extracts. Anti-91, an 
antiserum against amino acids 609 to 716 (15) that recognizes both Stat91L and 
Stat84 proteins, inhibited the binding of all three shift bands. Thus, the middle
10 band formed by extracts of the Cmx cells is clearly identified as a heterodimer of
Stat84 and Stat91L. We concluded that both Stat91 and Stat84 bind DNA as
homodimers and, if present in the same cell, will form heterodimers. 

We next wanted to detect the formation of dimers in vitro. When cytoplasmic or 
15 nuclear extracts of IFN-γ-treated C84 or C91L cells were mixed and analyzed
(Figure 19), only the fast or slow migrating gel shift bands were observed. Thus 
it appeared that once formed in vivo, the dimers were stable. To promote the 
formation of protein interchange between the subunits of the dimer, a mixture of 
either cytoplasmic or nuclear extracts of IFN-γ-treated C84 and C91L cells was
20 subjected to mild denaturation-renaturation treatment: extracts were made 0.5 M
with respect to guanidium hydrochloride for two minutes and then diluted for 
renaturation and subsequently used for gel retardation analysis. The formation of 
heterodimer was clearly detected after this treatment. When extracts from either 
C84 cells alone or C91L cells alone were subjected to the same treatment, the 
25 intermediate band did not form. The intermediate band was again proven by 
antiserum treatment to consist of Stat84/Stat91L dimer (data not shown). 

This experiment defined conditions under which the dimer was stable, but also 
showed that dissociation and reassociation of the dimer in vitro was possible. 
30 Since guanidium hydrochloride is known to disrupt only non-covalent chemical 
bonds, it seemed that Stat91 (or Stat84) homodimerization was mediated through
non-covalent interactions. 

Dimerization of Stat91 Involves Phosphotyrosyl Peptide and SH2 Interactions.
5 Based on the results described above, we devised a dissociation-reassociation assay
in the absence of guanidium hydrochloride to explore the possible nature of 
interactions involved in dimer formation (Figure 20). When the short and the long 
forms of a homodimer are mixed with a dissociating agent (e.g., a peptide 
containing the putative dimerization domain), the subunits of the dimer should 

10 dissociate (in a concentration dependent fashion) due to the interaction of the agent 
with the dimerization domain(s) of the protein. When a specific DNA probe is
subsequently added to the mixture to drive the formation of a stable protein-DNA 
complex, the detection of any reassociated or remaining dimers can be assayed. In 
the presence of low concentration of the dissociating agent, addition of DNA to 

15 form the stable protein-DNA complex should lead to the detection of homodimers 
as well as heterodimers. At high concentration of the dissociating agent, subunits 
of the dimer may not be able to re-form and no DNA-protein complexes would be 
detected (Figure 20). 
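
The band pattern expected from this assay can be illustrated with a simple counting argument. The sketch below is only an illustration: it assumes that freed short and long subunits re-pair at random, and the example call assumes equal amounts of the two forms. Neither assumption is stated in the text, and the function name is hypothetical.

```python
from fractions import Fraction


def expected_dimer_fractions(short_fraction):
    """Expected fractions of each dimer species if subunits re-pair at random.

    short_fraction is the fraction of all subunits that are the short form.
    Random re-pairing is an illustrative assumption, not a measured result.
    """
    s = Fraction(short_fraction)
    l = 1 - s
    return {
        "short/short": s * s,      # fast migrating homodimer
        "short/long": 2 * s * l,   # intermediate (heterodimer) band
        "long/long": l * l,        # slow migrating homodimer
    }


# Equal amounts of short and long subunits (assumed for the example).
fractions = expected_dimer_fractions(Fraction(1, 2))
for species, value in fractions.items():
    print(species, value)
```

Under these assumptions the intermediate band would be the most abundant species, which is why its appearance serves as the read-out for subunit exchange.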

The Stat91 sequence contains an SH2 domain (amino acids 569 to 700, see
discussion below), and we knew that Tyr-701 was the single phosphorylated
tyrosine residue required for DNA binding activity (supra, 45). Furthermore, we
have observed that phosphotyrosine at 10 mM, but not phosphoserine or
phosphothreonine, could prevent the formation of Stat91-DNA complex. We
therefore sought evidence that the dimerization of Stat91 involved specific
SH2-phosphotyrosine interaction using the dissociation and reassociation assay.

In order to evaluate the role of the SH2-phosphotyrosine interaction, two peptide
fragments of Stat91 corresponding to segments of the SH2 and phosphotyrosine
domains of Stat91 were prepared: a non-phosphorylated peptide (91Y),
LDGPKGTGYIKTELI (SEQ. ID NO: 18) (corresponding to amino acids 693-707),
and a phosphotyrosyl peptide (91Y-p), GY*IKTE (SEQ. ID
NO: 19) (representing residues 700-705).

Activated Stat84 or Stat91L was obtained from IFN-γ-treated C84 or C91L cells
and mixed in the presence of various concentrations of the peptides followed by
gel mobility shift analysis. The non-phosphorylated peptide had no effect on the
presence of the two gel shift bands characteristic of Stat84 or Stat91L homodimers
(Figure 21, lanes 2-4). In contrast, the phosphorylated peptide (91Y-p) at a
concentration of 4 μM clearly promoted the exchange between the subunits of
Stat84 dimers and Stat91L dimers to form heterodimers (Figure 21, lane 5). At a
higher concentration (160 μM), peptide 91Y-p, but not the unphosphorylated
peptide, dissociated the dimers and blocked the formation of DNA-protein
complexes (Figure 21, lane 7).

When cells are treated with IFN-α, both Stat91 (or Stat84) and Stat113 become
phosphorylated (15). Antiserum to Stat113 can precipitate both Stat113 and Stat91
after IFN-α treatment but not before, suggesting an IFN-α-dependent interaction of
these two proteins, perhaps as a heterodimer (15).

In Stat113, Tyr-690, in the position homologous to Tyr-701 in Stat91, is the single
target residue for phosphorylation. Amino acids downstream of the affected
tyrosine residue show some homology between the two proteins. We therefore
prepared a phosphotyrosyl peptide of Stat113 (113Y-p), KVNLQERRKY*LKHR
(SEQ. ID NO: 20) [amino acids 681 to 694; (38)]. At concentrations similar to
91Y-p, 113Y-p also promoted the exchange of subunits between Stat84 and
Stat91L, while at a high concentration (40 μM), 113Y-p prevented the gel shift
bands almost completely (Figure 21, lanes 8-10).
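
The residue numbering quoted for the three STAT-derived peptides can be checked against their sequences. The sketch below is a consistency check only: the sequences and ranges are copied from the text, while the helper names and the dictionary layout are hypothetical and introduced for illustration.

```python
# Peptide sequences and residue ranges as quoted in the text; "*" marks the
# phosphorylated tyrosine and is stripped before residues are counted.
PEPTIDES = {
    "91Y": ("LDGPKGTGYIKTELI", 693, 707),
    "91Y-p": ("GY*IKTE", 700, 705),
    "113Y-p": ("KVNLQERRKY*LKHR", 681, 694),
}


def tyrosine_positions(sequence, first_residue):
    """Map each tyrosine (Y) in a peptide to its position in the full protein."""
    residues = sequence.replace("*", "")
    return [first_residue + i for i, aa in enumerate(residues) if aa == "Y"]


def check_peptide(name):
    """Return (range matches peptide length, protein positions of tyrosines)."""
    sequence, first, last = PEPTIDES[name]
    length = len(sequence.replace("*", ""))
    # The stated range should span exactly as many residues as the peptide has.
    return length == last - first + 1, tyrosine_positions(sequence, first)


for name in PEPTIDES:
    print(name, check_peptide(name))
```

With the stated ranges, the single tyrosine in each Stat91 peptide falls at position 701 and the tyrosine in the Stat113 peptide at position 690, in agreement with the phosphorylation sites named in the text.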

We prepared a phosphotyrosyl peptide (SrcY-p), EPQY*EEIPIYL (SEQ. ID
NO: 21), which is known to interact with the Src SH2 domain with a high affinity
(50). This peptide showed no effect on Stat91 dimer formation (Figure 21,
lanes 11-13). Thus, it seems that Stat91 dimerization involves SH2 interaction with
tyrosine residues in a specific peptide sequence.

To test further the specificity of Stat91 dimerization mediated through specific
phosphotyrosyl peptide-SH2 interaction, a fusion product of glutathione-S-transferase
with the Stat91-SH2 domain (GST-91SH2) was prepared (Figure 22A)
and used in the in vitro dissociation-reassociation assay. At concentrations of 0.5
to 5 μM, the Stat91-SH2 domain promoted the formation of a heterodimer (Figure
22B, lanes 5-7). In contrast, neither GST alone, nor fusion products with a
mutant (R→L) Stat91-SH2 domain (GST-91mSH2) that renders Stat91
non-functional in vivo, a Stat91 SH3 domain (GST-91SH3), or the Src SH2 domain
(GST-SrcSH2) induced the exchange of subunits between the Stat84 and Stat91L
homodimers (Figure 22B).

Discussion

The initial sequence analysis of the Stat91 and Stat113 proteins revealed the
presence of SH2-like domains (see 13, 38). Further, it was found that STAT
proteins themselves are phosphorylated on single tyrosine residues during their
activation (15, 31). Single amino acid mutations either removing the Stat91
phosphorylation site, Tyr-701, or converting Arg-702 to Leu in the highly
conserved "pocket" region of the SH2 domain abolished the activity of Stat91 (45).
Thus it seemed highly likely that one possible role of the STAT SH2 domains
would be to bind the phosphotyrosine residues in one of the JAK kinases.

Since the activated STATs have phosphotyrosine residues and SH2 domains, a
second suggested role for SH2 domains was in protein-protein interactions within
the STAT family. By two physical criteria, electrophoresis in native gels and
sedimentation on gradients, Stat91 in untreated cells is a monomer and in treated
cells is a dimer (Figures 16-18). Since phosphotyrosyl peptides from Stat91 or
Stat113 and the SH2 domain of Stat91 could efficiently promote the formation of
heterodimers between Stat91L and Stat84 in a dissociation and reassociation
assay, we conclude that dimerization of Stat91 involves SH2-phosphotyrosyl
peptide interactions.

The possibility of an SH2 domain in Stat91 was indicated initially by the presence
of highly conserved amino acid stretches between the Stat91 and Stat113 sequences
in the 569 to 700 residue region, several of which, especially the FLLR sequence
in the amino terminal end of the region, are characteristic of SH2 domains. The
C-terminal halves of SH2 domains are less well conserved in general (39); this
was also true for the STAT proteins compared to other proteins, although Stat91
and Stat113 are quite similar in this region (38, 13, Figure 23). The available
structures of the lck, src, abl, and p85α SH2 domains permit identification of
structurally conserved regions (SCRs), and the detailed alignment of amino acid
sequences of several proteins (Figure 23) is based on these.

The characteristic W (in betaA1) is preceded by hydrophilic residues and is followed
by hydrophobic residues in Stat91, but alignment to the W seems justified, even if
the small beta sheet of which the W is part is shifted in Stat91. The three
positively charged residues contributing to the phosphotyrosyl binding site are at
the positions indicated as alphaA2, betaB5, and betaD5. Figure 23 shows an
alignment which accomplishes this by insertions in the 'AA' and 'CD' regions.
This is a different alignment from that previously suggested (38), and gives a
satisfactory alignment in the (beta)D region, although, like the previous alignment,
it is obviously considerably less similar to the other SH2 domains in the C-terminus.

This alignment suggests that the SH2 domain in Stat91 would end in the
vicinity of residue 700. In such an alignment, Tyr-701 occurs almost
immediately after the SH2 domain: a distance too short to allow an intramolecular
phosphotyrosine-SH2 interaction. Since the data presented earlier strongly
indicate that an SH2-phosphotyrosine interaction is involved in dimerization, such
an interaction is likely to be between two phospho-Stat91 subunits as a reciprocal
pTyr-SH2 interaction.



The apparent stability of the Stat91 dimer may be due to a high association rate
coupled with a high dissociation rate of SH2-phosphotyrosyl peptide interactions, as
suggested (Felder et al., 1993, Mol. Cell Biol. 13:1449-1455), together with
interactions between other domains of Stat91 that may contribute stability to the
Stat91 dimer. Interference by homologous phosphopeptides with the
SH2-phosphotyrosine interaction would then lower stability sufficiently to allow
complete dissociation and heterodimerization.

Dimer formation between phospho-Stat91 subunits is the first case in eukaryotes where
dimer formation is regulated by phosphorylation, and the only one thus far
dependent on tyrosine phosphorylation. We anticipate that dimerization within the
STAT protein family will be important. It seems likely that in cells treated with
IFN-α, there is Stat113-Stat91 interaction (15). This may well be mediated
through SH2 and phosphotyrosyl peptide interactions as described above, leading
to a complex (a probable dimer of Stat91-Stat113) which joins with a 48 kD DNA
binding protein (a member of another family of DNA binding factors) to make a
complex capable of binding to a different DNA site. Furthermore, we have
recently cloned two mouse cDNAs which encode other STAT family members that
have conserved the same general structural features observed in the Stat91 and
Stat113 molecules (see Example 5, supra) (U.S. Application Serial No.
08/126,588, filed September 29, 1993, which is specifically incorporated herein by
reference in its entirety). Thus the specificity of STAT-containing complexes will
almost surely be affected by which proteins are phosphorylated and then available
for dimer formation.

The following is a list of references related to the above disclosure and particularly
to the experimental procedures and discussions. The references are numbered to
correspond to like number references that appear hereinabove.

This invention may be embodied in other forms or carried out in other ways
without departing from the spirit or essential characteristics thereof. The present
disclosure is therefore to be considered as in all respects illustrative and not
restrictive, the scope of the invention being indicated by the appended Claims, and
all changes which come within the meaning and range of equivalency are intended
to be embraced therein.



WHAT IS CLAIMED IS:

1 1. A receptor recognition factor implicated in the transcriptional stimulation of

2 genes in target cells in response to the binding of a specific polypeptide ligand to 

3 its cellular receptor on said target cell, said receptor recognition factor having the 

4 following characteristics: 

5 a) apparent direct interaction with the ligand-bound receptor and 

6 activation of one or more transcription factors capable of binding with a specific 

7 gene; 

8 b) an activity demonstrably unaffected by the presence or concentration 

9 of second messengers; 

10 c) direct interaction with tyrosine kinase domains; and 

11 d) a perceived absence of interaction with G-proteins.

1 2. The receptor recognition factor of Claim 1 which is proteinaceous in 

2 composition. 

1 3. The receptor recognition factor of Claim 1 which is cytoplasmic in origin.

1 4. The receptor recognition factor of Claim 1 which is a polypeptide having 

2 an amino acid sequence selected from the group consisting of SEQ ID NO:2, SEQ 

3 ID NO: 10 and SEQ ID NO: 12. 

1 5. The receptor recognition factor of Claim 1 which is derived from 

2 mammalian cells. 

1 6. The receptor recognition factor of Claim 1 labeled with a detectable label. 

1 7. The receptor recognition factor of Claim 6 wherein the label is selected

2 from enzymes, chemicals which fluoresce and radioactive elements.

1 8. An antibody to a receptor recognition factor, the factor to which said

2 antibody is raised having the following characteristics: 

3 a) apparent direct interaction with the ligand-bound receptor and 

4 activation of one or more transcription factors capable of binding with a specific 

5 gene; 

6 b) an activity demonstrably unaffected by the presence or concentration 

7 of second messengers;

8 c) direct interaction with tyrosine kinase domains; and 

9 d) a perceived absence of interaction with G-proteins. 

1 9. The antibody of Claim 8 which is a polyclonal antibody.

1 10. The antibody of Claim 8 which is a monoclonal antibody.

1 11. An immortal cell line that produces a monoclonal antibody according to 

2 Claim 10. 

1 12. The antibody of Claim 8 labeled with a detectable label. 

1 13. The antibody of Claim 12 wherein the label is selected from enzymes, 

2 chemicals which fluoresce and radioactive elements. 

1 14. A DNA sequence or degenerate variant thereof, which encodes a receptor 

2 recognition factor, or a fragment thereof, selected from the group consisting of: 

3 (A) the DNA sequence of FIGURE 1;

4 (B) the DNA sequence of FIGURE 14;

5 (C) the DNA sequence of FIGURE 15; 

6 (D) DNA sequences that hybridize to any of the foregoing DNA 

7 sequences under standard hybridization conditions; and 

8 (E) DNA sequences that code on expression for an amino acid sequence 

9 encoded by any of the foregoing DNA sequences.

1 15. A recombinant DNA molecule comprising a DNA sequence or degenerate

2 variant thereof, which encodes a receptor recognition factor, or a fragment

3 thereof, selected from the group consisting of: 

4 (A) the DNA sequence of FIGURE 1; 

5 (B) the DNA sequence of FIGURE 14; 

6 (C) the DNA sequence of FIGURE 15; 

7 (D) DNA sequences that hybridize to any of the foregoing DNA 

8 sequences under standard hybridization conditions; and 

9 (E) DNA sequences that code on expression for an amino acid sequence 
10 encoded by any of the foregoing DNA sequences. 

1 16. The recombinant DNA molecule of either of Claims 14 or 15, wherein said 

2 DNA sequence is operatively linked to an expression control sequence. 

1 17. The recombinant DNA molecule of Claim 16, wherein said expression 

2 control sequence is selected from the group consisting of the early or late 

3 promoters of SV40 or adenovirus, the lac system, the trp system, the TAC system,

4 the TRC system, the major operator and promoter regions of phage λ, the control

5 regions of fd coat protein, the promoter for 3-phosphoglycerate kinase, the

6 promoters of acid phosphatase and the promoters of the yeast α-mating factors.

1 18. A probe capable of screening for the receptor recognition factor in alternate 

2 species prepared from the DNA sequence of Claim 14.

1 19. A unicellular host transformed with a recombinant DNA molecule

2 comprising a DNA sequence or degenerate variant thereof, which encodes a 

3 receptor recognition factor, or a fragment thereof, selected from the group 

4 consisting of: 

5 (A) the DNA sequence of FIGURE 1; 

6 (B) the DNA sequence of FIGURE 14;

7 (C) the DNA sequence of FIGURE 15;

8 (D) DNA sequences that hybridize to any of the foregoing DNA 

9 sequences under standard hybridization conditions; and 

10 (E) DNA sequences that code on expression for an amino acid sequence 

11 encoded by any of the foregoing DNA sequences;

12 wherein said DNA sequence is operatively linked to an expression control 

13 sequence. 

1 20. The unicellular host of Claim 19 wherein the unicellular host is selected 

2 from the group consisting of E. coli, Pseudomonas, Bacillus, Streptomyces,

3 yeasts, CHO, R1.1, B-W, L-M, COS 1, COS 7, BSC1, BSC40, and BMT10 cells,

4 plant cells, insect cells, and human cells in tissue culture. 

1 21. A method for detecting the presence or activity of a receptor recognition 

2 factor, said receptor recognition factor having the following characteristics: 

3 apparent direct interaction with the ligand-bound receptor and activation of one or 

4 more transcription factors capable of binding with a specific gene; an activity 

5 demonstrably unaffected by the presence or concentration of second messengers; 

6 direct interaction with tyrosine kinase domains; and a perceived absence of 

7 interaction with G-proteins, wherein said receptor recognition factor is measured 

8 by: 



9 A. contacting a biological sample from a mammal in which the 

10 presence or activity of said receptor recognition factor is suspected with a binding 

11 partner of said receptor recognition factor under conditions that allow binding of

12 said receptor recognition factor to said binding partner to occur; and 

13 B. detecting whether binding has occurred between said receptor

14 recognition factor from said sample and the binding partner; 

15 wherein the detection of binding indicates the presence or activity of said

16 receptor recognition factor in said sample.

1 22. A method for detecting the presence and activity of a polypeptide ligand

2 associated with a given invasive stimulus in mammals comprising detecting the 

3 presence or activity of a receptor recognition factor according to the method of 

4 Claim 21, wherein detection of the presence or activity of the receptor recognition 

5 factor indicates the presence and activity of a polypeptide ligand associated with a 

6 given invasive stimulus in mammals. 

1 23. The method of Claim 22 wherein said invasive stimulus is an infection. 

1 24. The method of Claim 22 wherein said invasive stimulus is selected from 

2 the group consisting of viral infection, protozoan infection, tumorous mammalian

3 cells, and toxins. 

1 25. A method for detecting the binding sites for a receptor recognition factor,

2 said receptor recognition factor having the following characteristics: 

3 apparent direct interaction with the ligand-bound receptor and activation of 

4 one or more transcription factors capable of binding with a specific gene; 

5 an activity demonstrably unaffected by the presence or concentration of 

6 second messengers; 

7 direct interaction with tyrosine kinase domains; and 

8 a perceived absence of interaction with G-proteins; wherein the binding 

9 sites for said receptor recognition factor are measured by: 

10 A. placing a labeled receptor recognition factor sample in 

11 contact with a biological sample from a mammal in which binding sites for said

12 receptor recognition factor are suspected; 

13 B. examining said biological sample in binding studies for the

14 presence of said labeled receptor recognition factor; 

15 wherein the presence of said labeled recognition factor indicates a binding 

16 site for a receptor recognition factor.

1 26. A method of testing the ability of a drug or other entity to modulate the

2 activity of a receptor recognition factor which comprises 

3 A. culturing a colony of test cells which has a receptor for the 

4 receptor recognition factor in a growth medium containing the receptor recognition 

5 factor; 

6 B. adding the drug under test; and 

7 C. measuring the reactivity of said receptor recognition factor with the

8 receptor on said colony of test cells, 

9 wherein said receptor recognition factor has the following characteristics: 

10 a) apparent direct interaction with the ligand-bound receptor and

11 activation of one or more transcription factors capable of binding with a specific

12 gene; 

13 b) an activity demonstrably unaffected by the presence or concentration 

14 of second messengers; 

15 c) direct interaction with tyrosine kinase domains; and 

16 d) a perceived absence of interaction with G-proteins.

1 27. An assay system for screening drugs and other agents for ability to 

2 modulate the production of a receptor recognition factor, comprising: 

3 A. culturing an observable cellular test colony inoculated with a drug 

4 or agent; 

5 B. harvesting a supernatant from said cellular test colony; and 

6 C. examining said supernatant for the presence of said receptor 

7 recognition factor wherein an increase or a decrease in a level of said receptor 

8 recognition factor indicates the ability of a drug to modulate the activity of said 

9 receptor recognition factor, said receptor recognition factor having the following 

10 characteristics: 

11 a) apparent direct interaction with the ligand-bound receptor and 

12 activation of one or more transcription factors capable of binding with a specific 

13 gene;

14 b) an activity demonstrably unaffected by the presence or concentration

15 of second messengers; 

16 c) direct interaction with tyrosine kinase domains; and 

17 d) a perceived absence of interaction with G-proteins. 

1 28. A test kit for the demonstration of a receptor recognition factor in a 

2 eukaryotic cellular sample, comprising: 

3 A. a predetermined amount of a detectably labelled specific binding 

4 partner of a receptor recognition factor, said receptor recognition factor having the 

5 following characteristics: apparent direct interaction with the ligand-bound receptor 

6 and activation of one or more transcription factors capable of binding with a 

7 specific gene; an activity demonstrably unaffected by the presence or concentration 

8 of second messengers; direct interaction with tyrosine kinase domains; and a 

9 perceived absence of interaction with G-proteins; 

10 B. other reagents; and 

11 C. directions for use of said kit. 



29. A test kit for demonstrating the presence of a receptor recognition factor in 
a eukaryotic cellular sample, comprising: 

A. a predetermined amount of a receptor recognition factor, said 
receptor recognition factor having the following characteristics: apparent direct 
interaction with the ligand-bound receptor and activation of one or more 
transcription factors capable of binding with a specific gene; an activity 
demonstrably unaffected by the presence or concentration of second messengers; 
direct interaction with tyrosine kinase domains; and a perceived absence of 
interaction with G-proteins; 

B. a predetermined amount of a specific binding partner of said 
receptor recognition factor; 

C. other reagents; and
D. directions for use of said kit;
wherein either said receptor recognition factor or said specific binding
partner is detectably labelled.

1 30. The test kit of Claim 28 or 29 wherein said labeled immunochemically 

2 reactive component is selected from the group consisting of polyclonal antibodies 

3 to the receptor recognition factor, monoclonal antibodies to the receptor 

4 recognition factor, fragments thereof, and mixtures thereof. 

1 31. A method of preventing and/or treating cellular debilitations, derangements 

2 and/or dysfunctions and/or other disease states in mammals, comprising 

3 administering to a mammal a therapeutically effective amount of a material

4 selected from the group consisting of a receptor recognition factor, an agent 

5 capable of promoting the production and/or activity of said receptor recognition 

6 factor, an agent capable of mimicking the activity of said receptor recognition 

7 factor, an agent capable of inhibiting the production of said receptor recognition 

8 factor, and mixtures thereof, or a specific binding partner thereto, said receptor 

9 recognition factor having the following characteristics: 

10 a) apparent direct interaction with the ligand-bound receptor and 

11 activation of one or more transcription factors capable of binding with a specific

12 gene; 

13 b) an activity demonstrably unaffected by the presence or concentration 

14 of second messengers; 

15 c) direct interaction with tyrosine kinase domains; and

16 d) a perceived absence of interaction with G-proteins. 

1 32. The method of Claim 31 wherein said disease states include chronic viral

2 hepatitis, hairy cell leukemia, and tumorous conditions. 

1 33. The method of Claim 31 wherein said receptor recognition factor is 

2 administered to modulate the course of therapy where interferon is being 

3 administered as the primary therapeutic agent.

1 34. The method of Claim 31 wherein said receptor recognition factor is

2 administered to modulate the course of therapy where interferon is being

3 co-administered with one or more additional therapeutic agents.

1 35. A pharmaceutical composition for the treatment of cellular debilitation, 

2 derangement and/or dysfunction in mammals, comprising: 

3 A. a therapeutically effective amount of a material selected from

4 the group consisting of a receptor recognition factor, an agent capable of 

5 promoting the production and/or activity of said receptor recognition factor, an 

6 agent capable of mimicking the activity of said receptor recognition factor, an 

7 agent capable of inhibiting the production of said receptor recognition factor, and

8 mixtures thereof, or a specific binding partner thereto, said receptor recognition 

9 factor having the following characteristics: apparent direct interaction with the 

10 ligand-bound receptor and activation of one or more transcription factors capable 

11 of binding with a specific gene; an activity demonstrably unaffected by the 

12 presence or concentration of second messengers; direct interaction with tyrosine 

13 kinase domains; and a perceived absence of interaction with G-proteins; and

14 B. a pharmaceutically acceptable carrier. 

1 36. A receptor recognition factor implicated in the transcriptional stimulation of

2 genes in target cells in response to the binding of a specific polypeptide ligand to 

3 its cellular receptor on said target cell, said receptor recognition factor having the 

4 following properties: 

5 a) it is present in cytoplasm; 

6 b) it undergoes tyrosine phosphorylation upon treatment of cells with

7 IFNα;

8 c) it activates transcription of an interferon stimulated gene;

9 d) it stimulates either an ISRE-dependent or a gamma activated site

10 (GAS)-dependent transcription in vivo;

11 e) it interacts with IFNα cellular receptors, and

12 f) it undergoes nuclear translocation upon stimulation of the IFN cellular

13 receptors with IFNα.

1 37. A receptor recognition factor implicated in the transcriptional stimulation of

2 genes in target cells in response to the binding of an interferon or interferon- 

3 related polypeptide ligand to its cellular receptor on said target cell, said receptor 

4 recognition factor having the following properties: 

5 a) it is present in vivo in mammalian cytoplasm before activation of 

6 cellular IFN receptors; 

7 b) it contains tyrosine sites that are phosphorylated in response to IFN 

8 stimulation of IFN receptors; 

9 c) it has a molecular weight selected from the group consisting of 48kD, 

10 84kD, 91kD and 113kD, or an amino acid sequence selected from the group 

11 consisting of SEQ ID NO: 10 and SEQ ID NO: 12, and

12 d) when phosphorylated, it recognizes an ISRE in the cell nucleus. 

1 38. The receptor recognition factor of either of Claims 36 or 37 in 

2 phosphorylated form.

1 39. An antibody which recognizes a phosphorylated ISGF3 polypeptide or a 

2 fragment thereof in phosphorylated form.

1 40. An antibody produced by injecting a substantially immunocompetent host 

2 with an antibody-producing effective amount of an ISGF3 polypeptide, and 

3 harvesting said antibody, said ISGF3 polypeptide having the following properties: 

4 a) it has a molecular weight of about 48kD, 84kD, 91kD or 113kD or an

5 amino acid sequence selected from the group consisting of SEQ ID NO: 10 and 

6 SEQ ID NO: 12; 

7 b) it can be isolated from mammalian cytoplasm; 

8 c) it contains tyrosine residues that are subject to phosphorylation in vivo 

9 upon treatment of cells with IFNα;

10 d) it can activate transcription of an interferon stimulated gene in vivo;

11 e) it can stimulate ISRE-dependent transcription in vivo;

12 f) it can interact with IFNα cellular receptors, and

13 g) it can undergo nuclear translocation upon stimulation of IFN cellular

14 receptors with IFNα.

1 41. The antibody of either of Claims 39 or 40 which is monoclonal.

1 42. The antibody of either of Claims 39 or 40 which is polyclonal.

1 43. A recombinant virus transformed with the DNA molecule, or a derivative 

2 or fragment thereof, in accordance with Claim 14.

1 44. A recombinant virus transformed with the DNA molecule, or a derivative 

2 or fragment thereof, in accordance with Claim 15. 

1 45. A method of enhancing IFNα activity in a mammal in need of such

2 treatment, comprising administering to said mammal an effective amount of a 

3 compound which (a) enhances the phosphorylation of intracellular ISGF3 proteins 

4 to form ISGF3-protein phosphates, or (b) inhibits the activity of a phosphatase 

5 enzyme which would otherwise reduce the level of phosphorylated ISGF3 proteins. 

1 46. A method of treating (a) chronic viral hepatitis or (b) hairy cell leukemia, 

2 in a mammal in need of such treatment, comprising administering to said mammal

3 an effective amount of a compound which (a) enhances the phosphorylation of 

4 ISGF3 proteins, or (b) decreases the level of phosphate removal from 

5 phosphorylated ISGF3 proteins. 

1 47. The method of Claim 45 wherein the activity of exogenous IFNα is

2 enhanced.

1 48. The method of Claim 45 wherein the activity of endogenous IFNα is

2 enhanced.

1 49. The method of Claim 47 wherein the compound and IFNα are administered

2 concurrently to the mammal in need of such treatment.

1 50. A method of determining the interferon-related pharmacological activity of 

2 a compound comprising: 

3 administering the compound to a mammal; 

4 determining the level of phosphorylated ISGF3 proteins present; and 

5 comparing the level of ISGF3 protein-phosphate to a standard.

1 51. In a method of treating hepatitis or leukemia in a mammal, wherein IFNα

2 is administered in an amount effective for treating such hepatitis or leukemia, the 

3 improvement comprising administering to said mammal an ISGF3 protein or a 

4 derivative thereof in an amount effective for enhancing the activity of said IFNα.

1 52. The method of Claim 51 wherein a derivative of said ISGF3 protein is 

2 administered. 

1 53. The method of Claim 51 wherein an ISGF3 protein is administered, having 

2 a molecular weight of about 48 kD, 84kD, 91kD or 113kD.

1 54. The method of Claim 52 wherein the derivative is a phosphorylated ISGF3 

2 protein. 

1 55. The recombinant DNA molecule of Claim 16 comprising plasmid pGEX- 

2 3X, clone E3 or plasmid pGEX-3X, clone E4. 

1 56. An antisense nucleic acid against a receptor recognition factor mRNA 

2 comprising a nucleic acid sequence hybridizing to said mRNA.

1 57. The antisense nucleic acid of Claim 56 which is RNA.

1 58. The antisense nucleic acid of Claim 56 which is DNA. 

1 59. The antisense nucleic acid of Claim 56 which binds to the initiation codon 

2 of any of said mRNAs.

1 60. A recombinant DNA molecule having a DNA sequence which, on 

2 transcription, produces an antisense ribonucleic acid against a receptor recognition 

3 factor mRNA, said antisense ribonucleic acid comprising a nucleic acid sequence

4 capable of hybridizing to said mRNA. 

1 61. A receptor recognition factor-producing cell line transfected with the 

2 recombinant DNA molecule of Claim 60. 

1 62. A method for creating a cell line which exhibits reduced expression of a 

2 receptor recognition factor, comprising transfecting a recognition factor-producing 

3 cell line with a recombinant DNA molecule of claim 60. 

1 63. A ribozyme that cleaves receptor recognition factor mRNA. 

1 64. The ribozyme of Claim 63 which is a Tetrahymena-type ribozyme. 

1 65. The ribozyme of Claim 63 which is a Hammerhead-type ribozyme. 

1 66. A recombinant DNA molecule having a DNA sequence which, upon 

2 transcription, produces the ribozyme of claim 63. 

1 67. A receptor recognition factor-producing cell line transfected with the 

2 recombinant DNA molecule of claim 66. 




1 68. A method for creating a cell line which exhibits reduced expression of a 

2 receptor recognition factor, comprising transfecting a recognition factor-producing 

3 cell line with the recombinant DNA molecule of claim 63. 

ABSTRACT 



Receptor recognition factors exist that recognize the specific cell receptor to 
which a specific ligand has been bound, and that may thereby signal and/or initiate 
the binding of the transcription factor to the DNA site. The receptor recognition 
factor is, in one instance, a part of a transcription factor, and also may interact 
with other transcription factors to cause them to activate and travel to the nucleus 
for DNA binding. The receptor recognition factor appears to be 
second-messenger-independent in its activity, as overt perturbations in second messenger 
concentrations are of no effect. The concept of the invention is illustrated by the 
results of studies conducted with interferon (IFN)-stimulated gene transcription, 
and particularly, the activation caused by both IFNα and IFNγ. Specific DNA 
and amino acid sequences for various human and murine receptor recognition 
factors are provided, as are polypeptide fragments of two of the ISGF-3 genes, 
and antibodies have also been prepared and tested. The polypeptides confirm 
direct involvement of tyrosine kinase in intracellular message transmission. 
Numerous diagnostic and therapeutic materials and utilities are also disclosed. 

ACTGCAACCCTAATCAGAGCCCAA 
10 

asn leu asp ser pro phe gin 
AAT CTT GAC AGC CCC TTT CAG 

30 

his ser leu leu pro val asp 
CAG AGC CTC CTG- CCT GTG GAC 

40 

ile glu asp gin asn trp gin 
ATT GAA GAC CAG AAC TGG CAG 

60 

ser lys ala thr met leu phe 
TCC AAG GCT ACC ATG CTA TTC 

70 

tyr glu cys gly arg cys ser 
TAT GAG TGT GGC CGT TGC AGC 

90 

gin his asn leu arg lys phe 

CAG CAC AAT TTG CGG AAA TTC 
100 

gin asp pro thr gin leu ala 
CAG GAT CCT ACC CAG TTG GCT 

120 

glu glu lys arg ile, leu lie 
GAA GAA AAA AGA ATT TTG ATC 

130 

gin gly glu pro val leu glu 
CAA GGA GAG CCA GTT CTC GAA 

150 

glu ile glu ser arg ile leu 
GAG ATT GAA TCC CGG ATC CTG 

160 

leu val lys ser ile ser gin 
CTG GTA AAA TCC ATC AGC CAA 

100 



1 

met ala gin trp glu met leu gin 
ATG GCG CAG- TGG GAA ATG CTG CAG 

20 

asp gin leu his gin leu tyr ser 
GAT CAG CTG CAC CAG CTT TAG TCG 



ile arg gin tyr leu ala val trp 
ATT CGA CAG TAG TTG GCT GTC TGG 

50^ 

glu ala ala leu gly^ser asp asp 
GAA GCT GCA CTT GGG AGT GAT GAT 



phe his phe leu asp gin leu asn. 
TTC CAC TTC TTG GAT CAG CTG AAC 

00 

gin asp pro glu ser leu leu leu 
CAG GAC CCA GAG TCC TTG TTG CTG 



cys arg asp ile gin pro phe ser 

TGC CGG GAC ATT CAG CCC TTT TCC 
110 

glu met ile phe asn leu leu leu 
GAG ATG ATC TTT AAC CTC CTT CTG 



gin ala gin arg ala gin leu glu 
CAG GCT CAG AGG GCC CAA TTG GAA 

140 

thr pro val glu ser gin gin his 
ACA CCT GTG GAG AGC CAG CAA CAT 



asp leu arg ala met met glu lys 
GAT TTA AGG GCT ATG ATG GAG AAG 

170 

leu lys asp gln gln asp val phe 
CTG AAA GAC CAG CAG GAT GTC TTC 

cys phe arg tyr lys ile gln 
TGC TTC CGA TAT AAG ATC CAG 

190 

asp pro his gin thr lys glu 
GAC CCC CAT CAG ACC AAA GAG 

210 

asn glu leu asp lys arg arg 
AAT GAA CTG GAC .AAA AGG AG A 

220 

ala leu leu gly arg leu thr 
GCA CTG CTA GGC CGA TTA ACT 

240 

lys leu glu glu trp lys ala 
AAG TTG GAG GAG TGG AAG GCC 

250 

ala pro lie asp his gly leu 
GCT CCC ATT GAC CAC GGG TTG 

270 

ala gly ala lys Igu leu phe 
GCT GGA GCA AAG CTG TTG TTT 

200 

leu lys gJy leu ser cys leu 
CTG AAG GGA CTG AGT TGC CTG 

300 

Lhr lys gly val asp leu arg 
ACC AAA CGG GTG GAC CTA CGC 

310 

gin arg jLeu leu his. arg ala 
CAG CGT CTG CTC CAC AGA GCC 

330 

met pro gin thr pro his arg 
ATG CCC CAA ACT CCC CAT CGA 

310 

lys phe thr val arg thr arg 
AAG TTC ACC GTC CGA ACA AGG 

360 

asn glu ser leu thr val giu 
AAT GAG TCA CTG ACT GTG GAA 

370 

gin leu gin gly phe arg lys 
CAA TTA CAA GGC ' TTC CGG AAG 

390 

lys thr lou thr pro glu lys 



ala lys gly lys thr pro ser leu 
GCC AAA GGG AAG ACA CCC TCT CTG 

200 

gin lys ile leu gin glu thr leu 
CAG AAG ATT CTG CAG GAA ACT CTC 



lys glu val leu asp ala ser lys 
AAG GAG GTG CTG GAT GCC TCC AAA 

230 

thr leu ile glu leu leu 'leu pro 
ACC CTA ATC GAG CTA CTG CTG CCA 



gin gin gin lys ala cys ile arg 
CAG CAG CAA AAA GCC TGC ATC AGA 

260' 

glu gin leu glu thr trp phe thr 
GAA CAG CTG GAG ACA TGG TTC ACA 



his leu arg gin leu leu lys glu 
CAC CTG AGG CAG CTG CTG AAG GAG 

290 

val ser tyr gin asp asp pro leu 
GTT AGC TAT CAG GAT GAC CCT CTG 



asn ala gin val thr glu leu leu 

AAC GCC CAG GTC ACA GAG TTG CTA 

320 

phe val val glu thr gin pro cys 

TTT GTG GTA GAA ACC CAG CCC TGC 



pro leu ile leu Xys thr gly ser 
CCC CTC ATC CTC AAG ACT GGC AGC 

350 

leu leu val arg leu gin glu gly 
CTG CTG GTG AGA CTC CAG GAA GGC 



val ser ile asp arg asn pro pro 
GTC TCC ATT GAC AGG AAT ^ CCT CCT 

300 

phe asn ile leu thr ser asn gin 
TTC AAC ATT CTG ACT TCA AAC CAG 



gly gln ser gln gly leu ile trp 

AAA ACT TTG ACC CCC 

asp phe gly tyr leu 
GAC TTT GGT TAG CTG 



gly lya igly ser asn 

GGA AAG GGC AGC AAT 

430 

his lie lie ser phe 

CAC ATC ATC AGC TTC 



gin glu leu lys thr 
CAG GAG CTG AAA ACG 

460 

met asn gin leu ser 
ATG AAC CAG CTC TCA 



leu leu ser pro asn 
TTG CTC AGC CCA AAC 

490 

pro lys ala pro trp 
CCC AAG GCC CCC TGG 



phe ser ser tyr val 
TTC TCC TCC TAT GTT 

520 

met leu arg asn lys 
ATG CTG AGA AAC AAG 



pro. leu leu ser trp 

CCA TTA TTG TCC TGG 

550 

gly lys leu pro phe 

GGC AAG TTA CCA TTC 



val his asp his leu 

GTA CAT GhC CAC CTG 

500 

gly phe val ser arg 

GGC TTT GTG AGT CGG 



met ser gly thr phe 
ATG TCT GGC ACC TTT 



GAG AAG GGG CAG AGT 



thr leu val glu gin 
ACT CTG GTG GAG CAA 

420 

lya gly pro leu gly 
AAG GGG CCA CTA GGT 



thr val lys tyr thr 
ACG GTC AAA TAT ACC 

450 

asp thr leu pro val 
GAC ACC CTC CCT GTG 



ile ala trp ala ser 
ATT GCC TGG GCT TCA 

400 

leu gin asn gin gin 
CTT CAG AAC CAG CAG 



ser leu leu gly pro 

AGC TTG CTG GGC CCT 

510 

gly arg gly leu asn 

GGC CGA GGC CTC AAC 



leu pile gly gin asn 
CTG TTC GGG CAG AAC 

540 

ala asp phe thr lys 
GCT GAC TTC ACT AAG 



trp thr trp leu asp 

TGG ACA TGG CTG GAC 

570 

lys asp leu trp asn 

AAG GAT CTC TGG AAT 



ser gin glu arg arg 
AGC CAG GAG CGC CGG 

600 

leu leu arg phe ser 
CTA CTG CGC TTC AGT 



CAG GGT TTG ATT TGG 
410 

org ser gly gly ser 
CGT TCA GGT GGT TCA 



val thr glu glu leu 
GTG ACA GAG GAA CTG 

440 

tyr gin gly leu lya 
TAG CAG GGT CTG' AAG 



val ile ile ser asn 
GTG ATT ^TT TCC AAC 

470 

val leu trp phe asn 
GTT CTC 'TGG TTC AAT 



phe phe ser asn pro 
TTC TTC TCC AAC CCC 

500 

ala leu ser trp gin 
GCT CTC AGT TGG CAG 



ser asp gin leu ser 
TCA GAC: CAG CTG AGC 

530 

cys arg thr glu asp 
TGT AGG ACT GAG GAT 



arg glu ser pro pro 
CGA GAG AGC CCT CCT 

5,60 

lys ile leu glu leu 
AAA ATT CTG GAG TTG 



asp gly arg ile met 
GAT GGA CGC ATC ATG 

590 

leu leu lys I'ys thr 
CTG CTG AAG AAG ACC 



glu ser ser glu gly 
GAA TCG TCA GAA GGG 

610 



620 



gly ile thr cya ser trp val glu his gin asp asp aap lys val 
GGC ATT ACC TGC TCC TGG GTG GAG CAC CAG GAT GAT GAG AAG GTG 

630 

leu ile tyr ser val gin pro tyr thr lys cjlu val leu gin ser 
CTC ATC TAG TOT GTG CAA CCG TAG ACG AAG GAG GTG CTG CAG TCA 

640 650 
leu pro leu thr glu ile ile arg his tyr gin leu leu thr glu 
CTC CCG CTG ACT GAA ATC ATC CGC CAT TAG CAG TTG CTC ACT GAG 

660 

glu asn lie pro glu aan pro leu arg phe leu tyr pro arg ile 
GAG AAT ATA CCT GAA'*AAC CCA CTG CGC TTC CTC TAT CCC CGA ATC 

670 680 

pro arg asp glu ala phe gly cya tyr tyr gin glu lys val asn 
CCC CGG GAT GAA GCT TTT GG6 TGC TAG TAG CAG GAG AAA GTT AAT 

690 

leu gin glu arg arg lys tyr leu lys his arg leu ile val val 
CTC CAG GAA CGG AGG AAA TAG CTG AAA CAC AGG CTC ATT GTG GTC 

700 710 
ser asn arg gin val asp glu leu gin gin pro leu glu leu lys 
TCT AAT AGA CAG GTG GAT GAA CTG CAA CAA CCG CTG GAG CTT AAG 



pro glu pro glu leu glu ser leu gl\i leu glu leu gly leu val 
CCA GAG CCA GAG CTG GAG TCA TTA GAG CTG GAA CTA GGG CTG GTG 



pro glu pro glu leu ser leu asp leu glu pro leu leu lys ala 
CCA GAG CCA GAG CTC AGC CTG GAC TTA GAG CCA CTG CTG AAG GCA 

750 

gly leu asp leu gly pro glu leu glu ser val leu glu ser thr 
GGG CTG GAT CTG GGG CCA GAG CTA GAG TCT GTG CTG GAG TCC ACT 

760 770 
leu glu pro val ile g]u pro tl^r leu cys met val ser gin thr 
CTG GAG CCT GTG ATA GAG CCC ACA CTA TGC ATG GTA TCA CAA ACA 

780 

val pro glu pro <isp gin gly pro val scr gin pro val pro glu 
GTG CCA GAG CCA GAC CAA GGA CCT GTA TCA CAG CCA GTG CCA GAG 

790 800 

pro asp leu pro cys asp leu arg his leu asn thr glu pro met 
CCA GAT TTG CCC TGT GAT CTG AGA CAT TTG AAC ACT GAG CCA ATG 

810 

glu lie phe arg asn cys val ] ys ile glu glu ile met pro asn 
GAA ATC TTC AGA AAC TGT GTA AAG ATT GAA GAA ATC ATG CCG AAT 



720 



730 



740 

820 830 
gly asp pro leu leu ala gly gln asn thr val asp glu val tyr 
GGT GAC CCA CTG TTG GCT GGC CAG AAC ACC GTG GAT GAG GTT TAC 

840 

val ser arg pro ser his phe tyr thr asp gly pro leu met pro 
GTC TCC CGC CCC AGC CAC TTC TAC ACT GAT GGA CCC TTG ATG CCT 

850 851 
ser asp phe AM 

TCT GAC TTC TAG GAACCACATTTCCTCTGTTCTTTTCATATCTCTTTGCCCTTCCTA 

ctcctcatag1:atgatattgttctccaaggatgggaatca'ggcatgtgtcccttccaagc 

TGTGTTAACTGTTCAAACTCAGGCCTGTGTGACTCCATTGGGGTGAGAGGTGAAAGCATA 

acatgggtacagaggggacaacaatgaatcagaacagatgctgagccataggtctaaata 
ggatcctggaggctgcctgctgtgctgggaggtataggggtcctgggggcaggccagggc 
agttgacaggtacttggagggctcagggcagtggcttctttccagtatggaaggatttca 
acattttaatagttggttaggctaaactggtgcatactggcattggccttggtggggagc 
acagacacaggataggactccatttctttcttccattccttcatgtctaggataacttgc 
tttcttctttcctttactcctggctcaagccctgaatttcttcttttcctgcatigggttg 
agagctttctgccttagcctaccatgtgaaactctaccctgaagaaagggatggatagga 
agtagacctctttttcttaccagtctcctcccctactctgccccctaagctggctgtacc 
tgttcctcccccataaaatgatcctgccaatctaaaatvaaaaa 

ATTAAACCTCTCGCCGAGCCCCTCCGCAGACTCTGCGCCGGAAAGTTTCATTTGCTGTATGCCATCCTCGA 



GAGGTGTCTAGGTTAACGTTCGCACTCTGTGTATATAACCTCGACAGTCTTGGCACCTAACGTGCTGTGCG 

Met Ser Gln Trp 

TAGCTGCTCCTTTGGTTGAATCCCCAGGCCCTTGTTGGGGCACAAGGTGGCAGG ATG TCT CAG TGG 

Tyr Glu Leu Gln Gln Leu Asp Ser Lys Phe Leu Glu Gln Val His Gln Leu Tyr 
TAC GAA CTT CAG CAG CTT GAC TCA AAA TTC CTG GAG CAG GTT CAC CAG CTT TAT 

Asp Asp Ser Phe Pro Met Glu Ile Arg Gln Tyr Leu Ala Gln Trp Leu Glu Lys 
GAT GAC AGT TTT CCC ATG GAA ATC AGA CAG TAC CTG GCA CAG TGG TTA GAA AAG 

Gln Asp Trp Glu His Ala Ala Asn Asp Val Ser Phe Ala Thr Ile Arg Phe His 
CAA GAC TGG GAG CAC GCT GCC AAT GAT GTT TCA TTT GCC ACC ATC CGT TTT CAT 

Asp Leu Leu Ser Gln Leu Asp Asp Gln Tyr Ser Arg Phe Ser Leu Glu Asn Asn 
GAC CTC CTG TCA CAG CTG GAT GAT CAA TAT AGT CGC TTT TCT TTG GAG AAT AAC 

Phe Leu Leu Gln His Asn Ile Arg Lys Ser Lys Arg Asn Leu Gln Asp Asn Phe 
TTC TTG CTA CAG CAT AAC ATA AGG AAA AGC AAG CGT AAT CTT CAG GAT AAT TTT 

Gln Glu Asp Pro Ile Gln Met Ser Met Ile Ile Tyr Ser Cys Leu Lys Glu Glu 
CAG GAA GAC CCA ATC CAG ATG TCT ATG ATC ATT TAC AGC TGT CTG AAG GAA GAA 

Arg Lys Ile Leu Glu Asn Ala Gln Arg Phe Asn Gln Ala Gln Ser Gly Asn Ile 
AGG AAA ATT CTG GAA AAC GCC CAG AGA TTT AAT CAG GCT CAG TCG GGG AAT ATT 

Gln Ser Thr Val Met Leu Asp Lys Gln Lys Glu Leu Asp Ser Lys Val Arg Asn 
CAG AGC ACA GTG ATG TTA GAC AAA CAG AAA GAG CTT GAC AGT AAA GTC AGA AAT 

Val Lys Asp Lys Val Met Cys Ile Glu His Glu Ile Lys Ser Leu Glu Asp Leu 
GTG AAG GAC AAG GTT ATG TGT ATA GAG CAT GAA ATC AAG AGC CTG GAA GAT TTA 

Gln Asp Glu Tyr Asp Phe Lys Cys Lys Thr Leu Gln Asn Arg Glu His Glu Thr 
CAA GAT GAA TAT GAC TTC AAA TGC AAA ACC TTG CAG AAC AGA GAA CAC GAG ACC 

Asn Gly Val Ala Lys Ser Asp Gln Lys Gln Glu Gln Leu Leu Leu Lys Lys Met 
AAT GGT GTG GCA AAG AGT GAT CAG AAA CAA GAA CAG CTG TTA CTC AAG AAG ATG 

Tyr Leu Met Leu Asp Asn Lys Arg Lys Glu Val Val His Lys Ile Ile Glu Leu 
TAT TTA ATG CTT GAC AAT AAG AGA AAG GAA GTA GTT CAC AAA ATA ATA GAG TTG 

Leu Asn Val Thr Glu Leu Thr Gln Asn Ala Leu Ile Asn Asp Glu Leu Val Glu 
CTG AAT GTC ACT GAA CTT ACC CAG AAT GCC CTG ATT AAT GAT GAA CTA GTG GAG 

Trp Lys Arg Arg Gln Gln Ser Ala Cys Ile Gly Gly Pro Pro Asn Ala Cys Leu 
TGG AAG CGG AGA CAG CAG AGC GCC TGT ATT GGG GGG CCG CCC AAT GCT TGC TTG 

Asp Gln Leu Gln Asn Trp Phe Thr Ile Val Ala Glu Ser Leu Gln Gln Val Arg 
GAT CAG CTG CAG AAC TGG TTC ACT ATA GTT GCG GAG AGT CTG CAG CAA GTT CGG 

Gln Gln Leu Lys Lys Leu Glu Glu Leu Glu Gln Lys Tyr Thr Tyr Glu His Asp 
CAG CAG CTT AAA AAG TTG GAG GAA TTG GAA CAG AAA TAC ACC TAC GAA CAT GAC 

Pro Ile Thr Lys Asn Lys Gln Val Leu Trp Asp Arg Thr Phe Ser Leu Phe Gln 
CCT ATC ACA AAA AAC AAA CAA GTG TTA TGG GAC CGC ACC TTC AGT CTT TTC CAG 

Gln Leu Ile Gln Ser Ser Phe Val Val Glu Arg Gln Pro Cys Met Pro Thr His 
CAG CTC ATT CAG AGC TCG TTT GTG GTG GAA AGA CAG CCC TGC ATG CCA ACG CAC 

Pro Gln Arg Pro Leu Val Leu Lys Thr Gly Val Gln Phe Thr Val Lys Leu Arg 
CCT CAG AGG CCG CTG GTG TTG AAG ACA GGG GTG CAG TTC ACT GTG AAG TTG AGA 

Leu Leu Val Lys Leu Gln Glu Leu Asn Tyr Asn Leu Lys Val Lys Val Leu Phe 
CTG TTG GTG AAA TTG CAA GAG CTG AAT TAT AAT TTG AAA GTC AAA GTC TTA TTT 

Asp Lys Asp Val Asn Glu Arg Asn Thr Val Lys Gly Phe Arg Lys Phe Asn Ile 
GAT AAA GAT GTG AAT GAG AGA AAT ACA GTA AAA GGA TTT AGG AAG TTC AAC ATT 

Leu Gly Thr His Thr Lys Val Met Asn Met Glu Glu Ser Thr Asn Gly Ser Leu 
TTG GGC ACG CAC ACA AAA GTG ATG AAC ATG GAG GAG TCC ACC AAT GGC AGT CTG 

Ala Ala Glu Phe Arg His Leu Gln Leu Lys Glu Gln Lys Asn Ala Gly Thr Arg 
GCG GCT GAA TTT CGG CAC CTG CAA TTG AAA GAA CAG AAA AAT GCT GGC ACC AGA 

Thr Asn Glu Gly Pro Leu Ile Val Thr Glu Glu Leu His Ser Leu Ser Phe Glu 
ACG AAT GAG GGT CCT CTC ATC GTT ACT GAA GAG CTT CAC TCC CTT AGT TTT GAA 

Thr Gln Leu Cys Gln Pro Gly Leu Val Ile Asp Leu Glu Thr Thr Ser Leu Pro 
ACC CAA TTG TGC CAG CCT GGT TTG GTA ATT GAC CTC GAG ACG ACC TCT CTG CCC 

Val Val Val Ile Ser Asn Val Ser Gln Leu Pro Ser Gly Trp Ala Ser Ile Leu 
GTT GTG GTG ATC TCC AAC GTC AGC CAG CTC CCG AGC GGT TGG GCC TCC ATC CTT 

Trp Tyr Asn Met Leu Val Ala Glu Pro Arg Asn Leu Ser Phe Phe Leu Thr Pro 
TGG TAC AAC ATG CTG GTG GCG GAA CCC AGG AAT CTG TCC TTC TTC CTG ACT CCA 

Pro Cys Ala Arg Trp Ala Gln Leu Ser Glu Val Leu Ser Trp Gln Phe Ser Ser 
CCA TGT GCA CGA TGG GCT CAG CTT TCA GAA GTG CTG AGT TGG CAG TTT TCT TCT 

Val Thr Lys Arg Gly Leu Asn Val Asp Gln Leu Asn Met Leu Gly Glu Lys Leu 
GTC ACC AAA AGA GGT CTC AAT GTG GAC CAG CTG AAC ATG TTG GGA GAG AAG CTT 

Leu Gly Pro Asn Ala Ser Pro Asp Gly Leu Ile Pro Trp Thr Arg Phe Cys Lys 
CTT GGT CCT AAC GCC AGC CCC GAT GGT CTC ATT CCG TGG ACG AGG TTT TGT AAG 

Glu Asn Ile Asn Asp Lys Asn Phe Pro Phe Trp Leu Trp Ile Glu Ser Ile Leu 
GAA AAT ATA AAT GAT AAA AAT TTT CCC TTC TGG CTT TGG ATT GAA AGC ATC CTA 

Glu Leu Ile Lys Lys His Leu Leu Pro Leu Trp Asn Asp Gly Cys Ile Met Gly 
GAA CTC ATT AAA AAA CAC CTG CTC CCT CTC TGG AAT GAT GGG TGC ATC ATG GGC 

Phe Ile Ser Lys Glu Arg Glu Arg Ala Leu Leu Lys Asp Gln Gln Pro Gly Thr 
TTC ATC AGC AAG GAG CGA GAG CGT GCC CTG TTG AAG GAC CAG CAG CCG GGG ACC 

Phe Leu Leu Arg Phe Ser Glu Ser Ser Arg Glu Gly Ala Ile Thr Phe Thr Trp 
TTC CTG CTG CGG TTC AGT GAG AGC TCC CGG GAA GGG GCC ATC ACA TTC ACA TGG 

Val Glu Arg Ser Gln Asn Gly Gly Glu Pro Asp Phe His Ala Val Glu Pro Tyr 
GTG GAG CGG TCC CAG AAC GGA GGC GAA CCT GAC TTC CAT GCG GTT GAA CCC TAC 

Thr Lys Lys Glu Leu Ser Ala Val Thr Phe Pro Asp Ile Ile Arg Asn Tyr Lys 
ACG AAG AAA GAA CTT TCT GCT GTT ACT TTC CCT GAC ATC ATT CGC AAT TAC AAA 

Val Met Ala Ala Glu Asn Ile Pro Glu Asn Pro Leu Lys Tyr Leu Tyr Pro Asn 
GTC ATG GCT GCT GAG AAT ATT CCT GAG AAT CCC CTG AAG TAT CTG TAT CCA AAT 

TCCTTCTCATCTGTGATTCCCTCCTGCTACTCTGTTCCTTCACATCCTGTGTTTCTAGGGAAATGAAAGAA 
AGGCCAGCAAATTCGCTGCAACCTGTTGATAGCAAGTGAATTTTTCTCTAACTCAGAAACATCAGTTACTC 
TGAAGGGCATCATGCATCTTACTGAAGGTAAAATTGAAAGGCATTCTCTGAAGAGTGGGTTTCACAAGTGA 
AAAACXTCCAGATACACCCAAAGTATCA6GACGAGAATGAGGGTCCTTTGGGAAAGGAGAAGT7AAGCAAC 
ATCTAGCAAATGTTATGCATAAAGTCAGTGCCCAACTGTTATAGGTTGTTGGATAAATCAGTGGTTATTTA 
GGGAACTGCTTGACGTAGGAACGGTAAATTTCTGTGGGAGAATTCTTACATGTTTTCTTTGCTTTAAGTGT 
AACTGGCAGTTTTCCATTGGTTTACCTGTGAAATAGTTCAAAGCCAAGTTTATATACAATTATATCAGTCC 
TCTTTCAAAGGTAGCCATCATGGATCTGGTAGGGGGAAAATGTGTATTTTATTACATCTTTCACATTGGCT 
ATTTAAAGACAAAGACAAATTCTGTTTCTTGAGAAGAGAATATTAGCTTTACTGTTTGTTATGGCTTAATQ 
ACACTAGCTAATATCAATAGAAGGATGTACATTTCCAAATTCACAAGTTGTGTTTGATATCCAAAGCTGAA 
TACATTCTGCTTTCATCTTGGTCACATACAATTATTTTTACAGTTCTCCCAAGGGAGTTAGGCTATTCACA 
ACCACTCATTCAAAAGTTGAAATTAACCATAGATGTAGATAAACTCAGAAATTTAATTCAT^TTTCTTAAA 
^TGGGCTACTTTGTCCTTTTTGTTATTAGGGTGGTATTTAGTCTATTAGCCACAAAATTGGGAAAGGAGTAG 
AAAAAGCAGTAACTGACAACTTGAATAATACACCAGAGATAATATGAGAATCAGATCATTTCAAAACTCAT 
TTCCTATGTAACTGCATTGAGAACTGCATATGTTTCGCTGATATATGTGTTTTTCACATTTGCGAATGGTT 

CCArrCTCTCTCCTGTACrrrrrCCAGACACrTtTTTCAGTGGATGATGTPTCGTGAAGTATACTGTATTT 
TTACCT'rTTTCCTTCCTTATCACTGACACAAAAAGTAGATTAAGAGATGCGrrTrGACAAGGTTCTTCCCTT 
TTACATACTGCTGTCTATG7GGCTGTATCTTGTTTTTCCACTACTGCTACCACAACTATATTATCATGCAA 
ATGCTGTATTCTTCTTTGGTGGAGATAAAGATTTCTTGAGTTtTGTTTTAAAATTAAAGCTAAAGTATCTG 
TATTGCATTAAATATAATATCGACACAGTGCTTTCCGTGGCACTGCATACAATCTGAGGCCTCCTCTCTCA 
GTTTTTATA7AGATGGCGAGAACCTAAGTTTCAGTTGATTTTACAATTGAAATGACTAAAAAACAAAGAAG 
ACAACATTAAAAACAATATTGTTTCTA 

ATTAAACCTCTCGCCGAGCCCCTCCGCAGACTCTGCGCCGGAAAGTTTCATTTGCTGTATGCCATCCTCGA 



GAGCTGTCTAGGTTAACGTTCGCACTCTGTGTATATAACCTCGACAGTCTTGGCACCTAACGTGCTGTGCG 

Met Ser Gln Trp 

TAGCTGCTCCTTTGGTTGAATCCCCAGGCCCTTGTTGGGGCACAAGGTGGCAGG ATG TCT CAG TGG 

Tyr Glu Leu Gln Gln Leu Asp Ser Lys Phe Leu Glu Gln Val His Gln Leu Tyr 
TAC GAA CTT CAG CAG CTT GAC TCA AAA TTC CTG GAG CAG GTT CAC CAG CTT TAT 

Asp Asp Ser Phe Pro Met Glu Ile Arg Gln Tyr Leu Ala Gln Trp Leu Glu Lys 
GAT GAC AGT TTT CCC ATG GAA ATC AGA CAG TAC CTG GCA CAG TGG TTA GAA AAG 

Gln Asp Trp Glu His Ala Ala Asn Asp Val Ser Phe Ala Thr Ile Arg Phe His 
CAA GAC TGG GAG CAC GCT GCC AAT GAT GTT TCA TTT GCC ACC ATC CGT TTT CAT 

Asp Leu Leu Ser Gln Leu Asp Asp Gln Tyr Ser Arg Phe Ser Leu Glu Asn Asn 
GAC CTC CTG TCA CAG CTG GAT GAT CAA TAT AGT CGC TTT TCT TTG GAG AAT AAC 

Phe Leu Leu Gln His Asn Ile Arg Lys Ser Lys Arg Asn Leu Gln Asp Asn Phe 
TTC TTG CTA CAG CAT AAC ATA AGG AAA AGC AAG CGT AAT CTT CAG GAT AAT TTT 

Gln Glu Asp Pro Ile Gln Met Ser Met Ile Ile Tyr Ser Cys Leu Lys Glu Glu 
CAG GAA GAC CCA ATC CAG ATG TCT ATG ATC ATT TAC AGC TGT CTG AAG GAA GAA 

Arg Lys Ile Leu Glu Asn Ala Gln Arg Phe Asn Gln Ala Gln Ser Gly Asn Ile 
AGG AAA ATT CTG GAA AAC GCC CAG AGA TTT AAT CAG GCT CAG TCG GGG AAT ATT 

Gln Ser Thr Val Met Leu Asp Lys Gln Lys Glu Leu Asp Ser Lys Val Arg Asn 
CAG AGC ACA GTG ATG TTA GAC AAA CAG AAA GAG CTT GAC AGT AAA GTC AGA AAT 

Val Lys Asp Lys Val Met Cys Ile Glu His Glu Ile Lys Ser Leu Glu Asp Leu 
GTG AAG GAC AAG GTT ATG TGT ATA GAG CAT GAA ATC AAG AGC CTG GAA GAT TTA 

Gln Asp Glu Tyr Asp Phe Lys Cys Lys Thr Leu Gln Asn Arg Glu His Glu Thr 
CAA GAT GAA TAT GAC TTC AAA TGC AAA ACC TTG CAG AAC AGA GAA CAC GAG ACC 

Asn Gly Val Ala Lys Ser Asp Gln Lys Gln Glu Gln Leu Leu Leu Lys Lys Met 
AAT GGT GTG GCA AAG AGT GAT CAG AAA CAA GAA CAG CTG TTA CTC AAG AAG ATG 

Tyr Leu Met Leu Asp Asn Lys Arg Lys Glu Val Val His Lys Ile Ile Glu Leu 
TAT TTA ATG CTT GAC AAT AAG AGA AAG GAA GTA GTT CAC AAA ATA ATA GAG TTG 

Leu Asn Val Thr Glu Leu Thr Gln Asn Ala Leu Ile Asn Asp Glu Leu Val Glu 
CTG AAT GTC ACT GAA CTT ACC CAG AAT GCC CTG ATT AAT GAT GAA CTA GTG GAG 

Trp Lys Arg Arg Gln Gln Ser Ala Cys Ile Gly Gly Pro Pro Asn Ala Cys Leu 
TGG AAG CGG AGA CAG CAG AGC GCC TGT ATT GGG GGG CCG CCC AAT GCT TGC TTG 

Asp Gln Leu Gln Asn Trp Phe Thr Ile Val Ala Glu Ser Leu Gln Gln Val Arg 
GAT CAG CTG CAG AAC TGG TTC ACT ATA GTT GCG GAG AGT CTG CAG CAA GTT CGG 

Gln Gln Leu Lys Lys Leu Glu Glu Leu Glu Gln Lys Tyr Thr Tyr Glu His Asp 
CAG CAG CTT AAA AAG TTG GAG GAA TTG GAA CAG AAA TAC ACC TAC GAA CAT GAC 

Pro Ile Thr Lys Asn Lys Gln Val Leu Trp Asp Arg Thr Phe Ser Leu Phe Gln 
CCT ATC ACA AAA AAC AAA CAA GTG TTA TGG GAC CGC ACC TTC AGT CTT TTC CAG 

Gln Leu Ile Gln Ser Ser Phe Val Val Glu Arg Gln Pro Cys Met Pro Thr His 
CAG CTC ATT CAG AGC TCG TTT GTG GTG GAA AGA CAG CCC TGC ATG CCA ACG CAC 

Pro Gln Arg Pro Leu Val Leu Lys Thr Gly Val Gln Phe Thr Val Lys Leu Arg 
CCT CAG AGG CCG CTG GTC TTG AAG ACA GGG GTC CAG TTC ACT GTG AAG TTG AGA 

Leu Leu Val Lys Leu Gln Glu Leu Asn Tyr Asn Leu Lys Val Lys Val Leu Phe 
CTG TTG GTG AAA TTG CAA GAG CTG AAT TAT AAT TTG AAA GTC AAA GTC TTA TTT 

Asp Lys Asp Val Asn Glu Arg Asn Thr Val Lys Gly Phe Arg Lys Phe Asn Ile 
GAT AAA GAT GTG AAT GAG AGA AAT ACA GTA AAA GGA TTT AGG AAG TTC AAC ATT 

Leu Gly Thr His Thr Lys Val Met Asn Met Glu Glu Ser Thr Asn Gly Ser Leu 
TTG GGC ACG CAC ACA AAA GTG ATG AAC ATG GAG GAG TCC ACC AAT GGC AGT CTG 

Ala Ala Glu Phe Arg His Leu Gln Leu Lys Glu Gln Lys Asn Ala Gly Thr Arg 
GCG GCT GAA TTT CGG CAC CTG CAA TTG AAA GAA CAG AAA AAT GCT GGC ACC AGA 

Thr Asn Glu Gly Pro Leu Ile Val Thr Glu Glu Leu His Ser Leu Ser Phe Glu 
ACG AAT GAG GGT CCT CTC ATC GTT ACT GAA GAG CTT CAC TCC CTT AGT TTT GAA 

Thr Gln Leu Cys Gln Pro Gly Leu Val Ile Asp Leu Glu Thr Thr Ser Leu Pro 
ACC CAA TTG TGC CAG CCT GGT TTG GTA ATT GAC CTC GAG ACG ACC TCT CTG CCC 

Val Val Val Ile Ser Asn Val Ser Gln Leu Pro Ser Gly Trp Ala Ser Ile Leu 
GTT GTG GTG ATC TCC AAC GTC AGC CAG CTC CCG AGC GGT TGG GCG TCC ATC CTT 

Trp Tyr Asn Met Leu Val Ala Glu Pro Arg Asn Leu Ser Phe Phe Leu Thr Pro 
TGG TAC AAC ATG CTG GTG GCG GAA CCC AGG AAT CTG TCC TTC TTC CTG ACT CCA 

Pro Cys Ala Arg Trp Ala Gln Leu Ser Glu Val Leu Ser Trp Gln Phe Ser Ser 
CCA TGT GCA CGA TGG GCT CAG CTT TCA GAA GTG CTG AGT TGG CAG TTT TCT TCT 

Val Thr Lys Arg Gly Leu Asn Val Asp Gln Leu Asn Met Leu Gly Glu Lys Leu 
GTC ACC AAA AGA GGT CTC AAT GTG GAC CAG CTG AAC ATG TTG GGA GAG AAG CTT 

Leu Gly Pro Asn Ala Ser Pro Asp Gly Leu Ile Pro Trp Thr Arg Phe Cys Lys 
CTT GGT CCT AAC GCC AGC CCC GAT GGT CTC ATT CCG TGG ACG AGG TTT TGT AAG 

Glu Asn Ile Asn Asp Lys Asn Phe Pro Phe Trp Leu Trp Ile Glu Ser Ile Leu 
GAA AAT ATA AAT GAT AAA AAT TTT CCC TTC TGG CTT TGG ATT GAA AGC ATC CTA 

Glu Leu Ile Lys Lys His Leu Leu Pro Leu Trp Asn Asp Gly Cys Ile Met Gly 
GAA CTC ATT AAA AAA CAC CTG CTC CCT CTC TGG AAT GAT GGG TGC ATC ATG GGC 

Phe Ile Ser Lys Glu Arg Glu Arg Ala Leu Leu Lys Asp Gln Gln Pro Gly Thr 
TTC ATC AGC AAG GAG CGA GAG CGT GCC CTG TTG AAG GAC CAG CAG CCG GGG ACC 

Phe Leu Leu Arg Phe Ser Glu Ser Ser Arg Glu Gly Ala Ile Thr Phe Thr Trp 
TTC CTG CTG CGG TTC AGT GAG AGC TCC CGG GAA GGG GCC ATC ACA TTC ACA TGG 

Val Glu Arg Ser Gln Asn Gly Gly Glu Pro Asp Phe His Ala Val Glu Pro Tyr 
GTG GAG CGG TCC CAG AAC GGA GGC GAA CCT GAC TTC CAT GCG GTT GAA CCC TAC 

Thr Lys Lys Glu Leu Ser Ala Val Thr Phe Pro Asp Ile Ile Arg Asn Tyr Lys 
ACG AAG AAA GAA CTT TCT GCT GTT ACT TTC CCT GAC ATC ATT CGC AAT TAC AAA 

Val Met Ala Ala Glu Asn Ile Pro Glu Asn Pro Leu Lys Tyr Leu Tyr Pro Asn 
GTC ATG GCT GCT GAG AAT ATT CCT GAG AAT CCC CTG AAG TAT CTG TAT CCA AAT 

Ile Asp Lys Asp His Ala Phe Gly Lys Tyr Tyr Ser Arg Pro Lys Glu Ala Pro 
ATT GAC AAA GAC CAT GCC TTT GGA AAG TAT TAC TCC AGG CCA AAG GAA GCA CCA 

Glu Pro Met Glu Leu Asp Gly Pro Lys Gly Thr Gly Tyr Ile Lys Thr Glu Leu 
GAG CCA ATG GAA CTT GAT GGC CCT AAA GGA ACT GGA TAT ATC AAG ACT GAG TTG 

Ile Ser Val Ser Glu Val 

ATT TCT GTG TCT GAA GTG TAAGTGAACACAGAAGAGTGACATGT7TACAAACCTCAAGCCAGCCT 
TGCTCCTGGCTGGGGCCTGTTGAAGATGCTTGTATTTTACTTTTCCATTGTAATTGCTATCGCCATCACAG 
CTGAACTTGTTGAGATCCCCGTGTTACTGCCTATCAGCATTTTACTACTTTAAAAAAAAAAAAAAAAGCCA 

AAAACCAAATTTGTATTTAAGGTATATAAATTTTCCCAAAACTGATACCCTTTGAAAAAGTATAAATAAAA 
TGAGCAAAAGTTGAA 

1 MSQWYELQQLDSKFLEQVHQLYDDSFPMEIRQYLAQWLEKQDWEHAANDV 

51 SFATIRFHDLLSQLDDQYSRFSLENNFLLQHNIRKSKRNLQDNFQEDPIQ 

101 MSMIIYSCLKEERKILENAQRFNQAQSGNIQSTVMLDKQKELDSKVRNVK 

151 DKVMCIEHEIKSLEDLQDEYDFKCKTLQNREHETNGVAKSDQKQEQLLLK 

201 KMYLMLDNKRKEVVHKIIELLNVTELTQNALINDELVEWKRRQQSACIGG 

251 PPNACLDQLQQVRQQLKKLEELEQKYTYEHDPITKNKQVLWDRTFSLFQQ 

301 LIQSSFVVERQPCMPTHPQRPLVLKTGVQFTVKLRLLVKLQELNYNLKVK 

351 VLFDKDVNERNTVKGFRKFNILGTHTKVMNMEESTNGSLAAEFRHLQLKE 

401 QKNAGTRTNEGPLIVTEELHSLSFETQLCQPGLVIDLETTSLPVVVISNV 

451 SQLPSGWASILWYNMLVAEPRNLSFFLTPPCARWAQLSEVLSWQFSSVTK 

501 RGLNVDQLNMLGEKLLGPNASPDGLIPWTRFCKENINDKNFPFWLWIESI 

551 LELIKKHLLPLWNDGCIMGFISKERERALLKDQQPGTFLLRFSESSREGA 
601 ITFTWVERSQNGGEPDFHAVEPYTKKELSAVTFPDIIRNYKVMAAENIPE 
651 NPLKYLYPNIDKDHAFGKYYSRPKEAPEPMELDGPKGTGYIKTELISVSE 
701 VHPSRLQTTDNLLPMSPEEFDEVSRIVGSVEFDSMMNTV 

4 

last amino acid of 84 kD 

1 MSQWFELQQL DSKFLEQVHQ LYDDSFPMEI RQYLAQWLEK QDWEHAAYDV 

51 SFATIRFHDL LSQLDDQYSR FSLENNFLLQ HNIRKSKRNL QDNFQEDPVQ 

101 MSMIIYNCLK EERKILENAQ RFNQAQEGNI QNTVMLDKQK ELDSKVRNVK 

151 DQVMCIEQEI KTLEELQDEY DFKCKTSQNR EGEANGVAKS DQKQEQLLLH 

201 KMFLMLDNKR KEIIHKIREL LNSIELTQNT LINDELVEWK RRQQSACIGG 

251 PPNACLDQLQ TWFTIVAETL QQIRQQLKKL EELEQKFTYE PDPITKNKQV 

301 LSDRTFLLFQ QLIQSSFVVE RQPCMPTHPQ RPLVLKTGVQ FTVKSRLLVK 

351 LQESNLLTKV KCHFDKDVNE KNTVKGFRKF NILGTHTKVM NMEESTNGSL 

401 AAELRHLQLK EQKNAGNRTN EGPLIVTEEL HSLSFETQLC QPGLVIDLET 

451 TSLPVVVISN VSQLPSGWAS ILWYNMLVTE PRNLSFFLNP PCAWWSQLSE 

501 VLSWQFSSVT KRGLNADQLS MLGEKLLGPN AGPDGLIPWT RFCKENINDK 

551 NFSFWPWIDT ILELIKNDLL CLWNDGCIMG FISKERERAL LKDQQPGTFL 

601 LRFSESSREG AITFTWVERS QNGGEPDFHA VEPYTKKELS AVTFPDIIRN 

t 

651 YKVMAAENIP ENPLKYLYPN IDKDHAFGKY YSRPKEAPEP MELDDPKRTG 

t 

701 YIKTELISVS EVHPSRLQTT DNLLPMSPEE FDEMSRIVGP EFDSMMSTV 

1 caggatgtca cagtggttcg agcttcagca gctggactcc aagttcctgg 
51 agcaggtcca ccagctgtac gatgacagtt tccccatgga aatcagacag 
101 tacctggccc agtggctgga aaagcaagac tgggagcacg ctgcctatga 
151 tgtctcgttt gcgaccatcc gcttccatga cctcctctca cagctggacg 
201 accagtacag ccgcttttct ctggagaata atttcttgtt gcagcacaac 
251 atacggaaaa gcaagcgtaa tctccaggat aacttccaag aagatcccgt 
301 acagatgtcc atgatcatct acaactgtct gaaggaagaa aggaagattt 
351 tggaaaatgc ccaaagattt aatcaggccc aggagggaaa tattcagaac 
401 actgtgatgt tagataaaca gaaggagctg gacagtaaag tcagaaatgt 
451 gaaggatcaa gtcatgtgca tagagcagga aatcaagacc ctagaagaat 
501 tacaagatga atatgacttt aaatgcaaaa cctctcagaa cagagaaggt 
551 gaagccaatg gtgtggcgaa gagcgaccaa aaacaggaac agctgctgct 
601 ccacaagatg tttttaatgc ttgacaataa gagaaaggag ataattcaca 
651 aaatcagaga gttgctgaat tccatcgagc tcactcagaa cactctgatt 
701 aatgacgagc tcgtggagtg gaagcgaagg cagcagagcg cctgcatcgg 
751 gggaccgccc aacgcctgcc tggatcagct gcaaac^tgg ttcaccattg 
801 ttgcagagac cctgcagcag atccgtcagc agcttaaaaa gctggaggag 
851 ttggaacaga aattcaccta tgagcccgac cctattacaa aaaacaagca 
901 ggtgttgtca gatcgaacct tcctcctctt ccagcagctc attcagagct 
951 ccttcgtggt agaacgacag ccgtgcatgc ccactcaccc gcagaggccc 
1001 ctggtcttga agactggggt acagttcact gtcaagtcga gactgttggt 
1051 gaaattgcaa gagtcgaatc tattaacgaa agtgaaatgt cactttgaca 
1101 aagatgtgaa cgagaaaaac acagttaaag gatttcggaa gttcaacatc 
1151 ttgggtacgc acacaaaagt gatgaacatg gaagaatcca ccaacggaag 
1201 tctggcagct gagctccgac acctgcaact gaaggaacag aaaaacgctg 
1251 ggaacagaac taatgagggg cctctcattg tcaccgaaga acttcactct 
1301 cttagctttg aaacccagtt gtgccagcca ggcttggtga ttgacctgga 
1351 gaccacctct cttcctgtcg tggtgatctc caacgtcagc cagctcccca 

eggtacaaca tgctggtgac agagcccagg 
ccccccgtgc gcgtggtggt cccagctctc 
tttcatcagt caccaagaga ggtctgaacg 
ggagagaagc tgctgggccc taatgctggc 
gacaaggttt tgtaaggaaa atattaatga 
cttggattga caccatccta gagctcatta 
tggaatgatg ggtgcattat gggcttcatc 
tctgctcaag gaccagcagc cagggacgtt 
gctcccggga aggggccatc acattcacat 
ggaggtgaac ctgacttcca tgccgtggag 
ttcagctgtt actttcccag atattattcg 
ccgagaacat accagagaat cccctgaagt 
aaagaccacg cctttgggaa gtattattcc 
accgatggag cttgacgacc ctaagcgaac 
tgatttctgt gtctgaagtc cacccttcta 
ctgcttccca tgtctccaga ggagtttgat 
ccccgaattt gacagtatga tgagcacagt 
ggcgaca 

1 MSQWNQVQQL EIKFLEQVDQ FYDDNFPMEI RHLLAQWIET QDWEVASNNE 

51 TMATILLQNL LIQLDEQLGR VSKEKNLLLI HNLKRIRKVL QGKFHGNPMH 

101 VAVVISNCLR EERRIIAAAN MPIQGPLEKS LQSSSVSERQ RNVEHKVSAI 

151 KNSVQMTEQD TKYLEDLQDE FDYRYKTIQT MDQGDKNSIL VNQEVLTLLQ 

201 EMLNSLDFKR KEALSKMTQI VNETDLLMNS MLLEELQDWK KRHRIACIGG 

251 PLHNGIDQLQ NCFTLLAESL FQLRQQLEKL QEQSTKMTYE GDPIPAQRAH 

301 LLERATFLIY NLFKNSFVVE RHACMPTHPQ RPMVLKTLIQ FTVKLRLLIK 

351 LPELNYQVKV KASIDKNVST LSNRRFVLCG THVKAMSSEE SSNGSLSVEL 

401 DIATQGDEVQ YWSKGNEGCH MVTEELHSIT FETQICLYGL TINLETSSLP 

451 VVMISNVSQL PNAWASIIWY NVSTNDSQML VFFNNPPSVT LGQLLEVMSW 

501 QFSSYVGRGL NSEQLNMLAE KLTVQSNYND GHLTWAKFCK EHLPGKTFTF 

551 WTWLEAILDL IKKHILPLWI DGYIMGFVSK EKERLLLKDK MPGTFLLRFS 

601 ESHLGGITFT WVDQSENGEV RFHSVEPYNK GRLSALAFAD ILRDYKVIMA 

t 

651 ENIPENPIKY LYPDIPKDKA FGKHYSSQPC EVSRPTERGD KGYVPSVFIP 

701 ISTIRSDSTE PQSPSDLLPM SPSAYAVLRE NLSPTTIETA MNSPYSAE 



Figure 14A 

1 tgccactacc tggacggaga gagagagagc agcatgtctc agtggaatca 

51 agtccaacaa ttagaaatca agtttttgga gcaagtagat cagttctatg 

101 atgacaactt tcctatggaa atccggcatc tgctagctca gtggattgag 

151 actcaagact gggaagtagc ttctaacaat gaaactatgg caacaattct 

201 gcttcaaaac ttactaatac aattggatga acagttgggg cgggtttcca 

251 aagaaaaaaa tctgctattg attcacaatc taaagagaat tagaaaagtt 

301 cttcagggca agtttcatgg aaatccaatg catgtagctg tggtaatttc 

351 aaattgctta agggaagaga ggagaatatt ggctgcagcc aacatgccta 

401 tccagggacc tctggagaaa tccttacaga gttcttcagt ttctgaaaga 

451 caaaggaatg tggaacacaa agtgtctgcc attaaaaaca gtgtgcagat 

501 gacagaacaa gataccaaat acttagaaga cctgcaagat gagtttgact 

551 acaggtataa aacaattcag acaatggatc agggtgacaa aaacagtatc 

601 ctggtgaacc aggaagtttt gacactgctg caagaaatgc ttaatagtct 

651 ggacttcaag agaaaggaag cactcagtaa gatgacgcag atagtgaacg 

701 agacagacct gctcatgaac agcatgcttc tagaagagct gcaggactgg 

751 aaaaagcggc acaggattgc ctgcattggt ggcccgctcc acaatgggct 

801 ggaccagctt cagaactgct ttaccctact ggcagagagt cttttccaac 
851 tcagacagca actggagaaa ctacaggagc aatctactaa aatgacctat 

901 gaaggggatc ccatccctgc tcaaagagca cacctcctgg aaagagctac 

951 cttcctgatc tacaaccttt tcaagaactc atttgtggtc gagcgacacg 

1001 catgcatgcc aacgcaccct cagaggccga tggtacttaa aaccctcatt 

1051 cagttcactg taaaactgag attactaata aaattgccgg aactaaacta 

1101 tcaggtgaaa gtaaaggcgt ccattgacaa gaatgtttca actctaagca 

1151 atagaagatt tgtgctttgt ggaactcacg tcaaagctat gtccagtgag 

1201 gaatcttcca atgggagcct ctcagtggag ttagacattg caacccaagg 

1251 agatgaagtg cagtactgga gtaaaggaaa cgagggctgc cacatggtga 

1301 cagaggagtt gcattccata acctttgaga cccagatctg cctctatggc 

1351 ctcaccatta acctagagac cagctcatta cctgtcgtga tgatttctaa 

1401 tgtcagccaa ctacctaatg catgggcatc catcatttgg tacaatgtat 

1451 caactaacga ctcccagaac ttggttttct ttaataaccc tccatctgtc 

1501 actttgggcc aactcctgga agtgatgagc tggcaatttt catcctatgt 

1551 cggtcgtggc cttaattcag agcagctcaa catgctggca gagaagctca 

1601 cagttcagtc taactacaat gatggtcacc tcacctgggc caagttctgc 

1651 aaggaacatt tgcctggcaa aacatttacc ttctggactt ggcttgaagc 

1701 aatattggac ctaattaaaa aacatattct tcccctctgg attgatgggt

1751 acatcatggg atttgttagt aaagagaagg aacggcttct gctcaaagat 

1801 aaaatgcctg ggacattttt gttaagattc agtgagagcc atcttggagg 
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1851 



gataaccttc acctgggtgg 



accaatctga aaatggagaa gtgagattcc 



1901 actctgtaga accctacaac aaagggagac tgtcggctct ggccttcgct 
1951 gacatcctgc gagactacaa ggttatcatg gctgaaaaca tccctgaaaa 



2051 aacactacag ctcccagccg tgcgaagtct caagaccaac cgaacgggga 

2101 gacaagggtt acgtcccctc tgtttttatc cccatttcaa caatccgaag 

2151 cgattccacg gagccacaat ctccttcaga ccttctcccc atgtctccaa 

2201 gtgcatatgc tgtgctgaga gaaaacctga gcccaacgac aattgaaact 

2251 gcaatgaatt ccccatattc tgctgaatga cggtgcaaac ggacacttta 

2301 aagaaggaag cagatgaaac tggagagtgt tct£taccat agatcacaat 

2351 ttatttcttc ggctttgtaa atacc 



2001 



ccctctgaag tacctctacc 



ctgacattcc caaagacaaa gcctttggca 
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1 MAQWNQLQQL DTRYLKQLHQ LYSDTFPMEL RQFLAPWIES QDWAYAASKB 

51 SHATLVFHNL LGEIDQQYSR FLQESNVLYQ HNLRRIKQFL QSRYLEKPME

101 IARIVARCLW EESRLLQTAA TAAQQGGQAN HPTAAVVTEK QQMLEQHLQD

151 VRKRVQDLEQ KMKVVENLQD DFDFNYKTLK SQGDMQDLNG NNQSVTRQKM

201 QQLEQMLTAL DQMRRSIVSE LAGLLSAMEY VQKTLTDEEL ADWKRRPEIA 

251 CIGGPPNICL DRLENWITSL AESQLQTRQQ IKKLEELQQK VSYKGDPIVQ 

301 HRPMLEERIV ELFRNIMKSA FVVERQPCMP MHPDRPLVIK TGVQFTTKVR

351 LLVKFPELNY QLKIKVCIDK DSGDVAALRG SRKFNILGTN TKVMNMEESN 

401 NGSLSAEFKH LTLREQRCGN GGRANCDASL IVTEELHLIT FETEVYHQGL 

451 KIDLETHSLP VVVISNICQM PNAWASILWY NMLTNNPKNV NFFTKPPIGT

501 WDQVAEVLSW QFSSTTKRGL SIEQLTTLAE KLLGPGVNYS GCQITWAKFC 

551 KENMAGKGFS FWVWLDNIID LVKKYILALW NEGYIMGFIS KERERAILST 

601 KPPGTFLLRF SESSKEGGVT FTWVEKDISG KTQIQSVEPY TKQQLNNMSF 

651 AEIIMGYKIM DATNILVSPL VYLYPDIPKE EAFGKYCRPE SQEHPEADPG

701 SAAPYLKTKF ICVTPTTCSN TIDLPMSPRT LDSLMQFGNN GEGAEPSAGG 

751 QFESLTFDMD LTSECATSPM



Figure 15A


1 gccgcgacca gccaggccgg ccagtcgggc tcagcccgga gacagtcgag

51 acccctgact gcagcaggat ggcccagtgg aaccagctgc agcagctgga

101 cacacgctac ctgaagcagc tgcaccagct gtacagcgac acgttcccca 

151 tggagctgcg gcagttcctg gcaccttgga ttgagagtca agactgggca 

201 tatgcagcca gcaaagagtc acatgccacg ttggtgtttc ataatctctt 

251 gggtgaaatt gaccagcaat atagccgatt cctgcaagag tccaatgtcc 

301 tctatcagca caaccttcga agaatcaagc agtttctgca gagcaggtat 

351 cttgagaagc caatggaaat tgcccggatc gtggcccgat gcctgtggga 

401 agagtctcgc ctcctccaga cggcagccac ggcagcccag caagggggcc 

451 aggccaacca cccaacagcc gccgtagtga cagagaagca gcagatgttg 

501 gagcagcatc ttcaggatgt ccggaagcga gtgcaggatc tagaacagaa 

551 aatgaaggtg gtggagaacc tccaggacga ctttgatttc aactacaaaa 

601 ccctcaagag ccaaggagac atgcaggatc tgaatggaaa caaccagtct 

651 gtgaccagac agaagatgca gcagctggaa cagatgctca cagccctgga 

701 ccagatgcgg agaagcattg tgagtgagct ggcggggctc ttgtcagcaa 

751 tggagtacgt gcagaagaca ctgactgatg aagagctggc tgactggaag 

801 aggcggccag agatcgcgtg catcggaggc cctcccaaca tctgcctgga 

851 ccgtctggaa aactggataa cttcattagc agaatctcaa cttcagaccc
901 gccaacaaat taagaaactg gaggagctgc agcagaaagt gtcctacaag
951 ggcgacccta tcgtgcagca ccggcccatg ctggaggaga ggatcgtgga
1001 gctgttcaga aacttaatga agagtgcctt cgtggtggag cggcagccct 
1051 gcatgcccat gcacccggac cggcccttag tcatcaagac tggtgtccag 
1101 tttaccacga aagtcaggtt gctggtcaaa tttcctgagt tgaattatca 
1151 gcttaaaatt aaagtgtgca ttgataaaga ctctggggat gttgctgccc 
1201 tcagagggtc tcggaaattt aacattctgg gcacgaacac aaaagtgatg 
1251 aacatggagg agtctaacaa cggcagcctg tctgcagagt tcaagcacct 
1301 gacccttagg gagcagagat gtgggaatgg aggccgtgcc aattgtgatg 
1351 cctccttgat cgtgactgag gagctgcacc tgatcacctt cgagactgag 
1401 gtgtaccacc aaggcctcaa gattgaccta gagacccact ccttgccagt 
1451 tgtggtgatc tccaacatct gtcagatgcc aaatgcttgg gcatcaatcc 
1501 tgtggtataa catgctgacc aataacccca agaacgtgaa cttcttcact 
1551 aagccgccaa ttggaacctg ggaccaagtg gccgaggtgc tcagctggca 
1601 gttctcgtcc accaccaagc gagggctgag catcgagcag ctgacaacgc 
1651 tggctgagaa gctcctaggg cctggtgtga actactcagg gtgtcagatc 
1701 acatgggcta aattctgcaa agaaaacatg gctggcaagg gcttctcctt
1751 ctgggtctgg ctagacaata tcatcgacct tgtgaaaaag tatatcttgg 
1801 ccctttggaa tgaagggtac atcatgggtt tcatcagcaa ggagcgggag
1851 cgggccatcc taagcacaaa gcccccgggc accttcctac tgcgcttcag
1901 cgagagcagc aaagaaggag gggtcacttt cacttgggtg gaaaaggaca 
1951 tcagtggcaa gacccagatc cagtctgtag agccatacac caagcagcag 
2001 ctgaacaaca tgtcatttgc tgaaatcatc atgggctata agatcatgga 
2051 tgcgaccaac atcctggtgt ctccacttgt ctacctctac cccgacattc 
2101 ccaaggagga ggcatttgga aagtactgta ggcccgagag ccaggagcac 
2151 cccgaagccg acccaggtag tgctgccccg tacctgaaga ccaagttcat 
2201 ctgtgtgaca ccaacgacct gcagcaatac cattgacctg ccgatgtccc 
2251 cccgcacttt agattcattg atgcagtttg gaaataacgg tgaaggtgct 
2301 gagccctcag caggagggca gtttgagtcg ctcacgtttg acatggatct
2351 gacctcggag tgtgctacct cccccatgtg aggagctgaa accagaagct 
2401 gcagagacgt gacttgagac acctgccccg tgctccaccc ctaagcagcc 
2451 gaaccccata tcgtctgaaa ctcctaactt tgtggttcca gatttttttt 
2501 tttaatttcc tacttctgct atctttgggc aatctgggca ctttttaaaa 
2551 gagagaaatg agtgagtgtg ggtgataaac tgttatgtaa agaggagaga 
2601 cctctgagtc tggggatggg gctgagagca gaagggaggc aaaggggaac 
2651 acctcctgtc ctgcccgcct gccctccttt ttcagcagct cgg^ggttgg 
2701 ttgttagaca agtgcctcct ggtgcccatg gctacctgtt gccccactct 
2751 gtgagctgat accccattct gggaactcct ggctctgcac tttcaacctt
2801 gctaatatcc acatagaagc taggactaag cccaggaggt tcctctttaa
2851 attaaaaaaa aaaaaaaaa


Name NA pA AA oA AB pB PC

pD6

Stat91 (620) S QN GGEPDFHAVEPYTKKELSAVTFP DIIRNYKV MAAENIPENPL (664)

src (189) FFDNAKGL NVKHYKI RKLDS G (210)

lck (169) DFDQNQGE WKHYKX RNLDN G (189)

abl (185) EE G RVYKYHI NTASD G (200)

p85aN (375) GG NNKLIKI FHR D G (388)

SCR'S XXXXXXXX X

I ) [ ] [-] [ ]

Name CD pD pD'

aB9

Stat91 (665) KYLY P NID K KDHAFGKYYSRP PK EA PEP M ELD GPKGTGYIKT (704)

src (211) GFYI TSR TQF S SLQQLVAYYSKH AD GL CH RLTNVCPTS (248)

lck (190) GFYI SPR ITP P GLHDLVRHYTKA SDGLCT RLS RPCQTQ (227)

abl (201) KLY7 SSE SRF N TLASLVHHHSTV ADGIilT TLH YPAPKR (238)

p85aN (389) KYGF SDP LTF N SWSLINKYRHE S LAQYNPKLDVKL LYP (427)

SCR'S XXX XXXXXXXXXX 

C — ] E-] [-3 t— — 3 [ 1 

Name pE EF pp oB ^G po GQ


SEQUENCE LISTING



(1) GENERAL INFORMATION: 

(i) APPLICANT: Darnell Jr., James E.
Schindler, Christian W.
Fu, Xian-Yuan
Wen, Zilong
Zhong, Zhong

(ii) TITLE OF INVENTION: RECEPTOR RECOGNITION FACTORS, PROTEIN 
SEQUENCES AND METHODS OF USE THEREOF 

(iii) NUMBER OF SEQUENCES: 25 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Klauber & Jackson 

(B) STREET: 411 Hackensack Avenue

(C) CITY: Hackensack 

(D) STATE: New Jersey 

(E) COUNTRY: USA 

(F) ZIP: 07601 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: US 08/212,185 

(B) FILING DATE: 11-MAR-1994

(C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 07/980,498 

(B) FILING DATE: 23-NOV-1992 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 07/854,296 

(B) FILING DATE: 19-MAR-1992 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: WO US93/02569 

(B) FILING DATE: 19-MAR-1993 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 08/126,588 

(B) FILING DATE: 24-SEP-1993 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Jackson Esq., David A. 

(B) REGISTRATION NUMBER: 26,742 

(C) REFERENCE/DOCKET NUMBER: 600-1-073 CIP

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 201 487-5800 

(B) TELEFAX: 201 343-1684 

(C) TELEX: 133521 

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3268 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: both

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(vii) IMMEDIATE SOURCE:

(B) CLONE: HeLa

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 25..2577

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

ACTGCAACCC TAATCAGAGC CCAA ATG GCG CAG TGG GAA ATG CTG CAG AAT 51
Met Ala Gln Trp Glu Met Leu Gln Asn
1 5

CTT GAC AGC CCC TTT CAG GAT CAG CTG CAC CAG CTT TAC TCG CAC AGC 99
Leu Asp Ser Pro Phe Gln Asp Gln Leu His Gln Leu Tyr Ser His Ser
10 15 20 25

CTC CTG CCT GTG GAC ATT CGA CAG TAC TTG GCT GTC TGG ATT GAA GAC 147
Leu Leu Pro Val Asp Ile Arg Gln Tyr Leu Ala Val Trp Ile Glu Asp
30 35 40

CAG AAC TGG CAG GAA GCT GCA CTT GGG AGT GAT GAT TCC AAG GCT ACC 195
Gln Asn Trp Gln Glu Ala Ala Leu Gly Ser Asp Asp Ser Lys Ala Thr
45 50 55

ATG CTA TTC TTC CAC TTC TTG GAT CAG CTG AAC TAT GAG TGT GGC CGT 243
Met Leu Phe Phe His Phe Leu Asp Gln Leu Asn Tyr Glu Cys Gly Arg
60 65 70

TGC AGC CAG GAC CCA GAG TCC TTG TTG CTG CAG CAC AAT TTG CGG AAA 291
Cys Ser Gln Asp Pro Glu Ser Leu Leu Leu Gln His Asn Leu Arg Lys
75 80 85

TTC TGC CGG GAC ATT CAG CCC TTT TCC CAG GAT CCT ACC CAG TTG GCT 339
Phe Cys Arg Asp Ile Gln Pro Phe Ser Gln Asp Pro Thr Gln Leu Ala
90 95 100 105

GAG ATG ATC TTT AAC CTC CTT CTG GAA GAA AAA AGA ATT TTG ATC CAG 387
Glu Met Ile Phe Asn Leu Leu Leu Glu Glu Lys Arg Ile Leu Ile Gln
110 115 120

GCT CAG AGG GCC CAA TTG GAA CAA GGA GAG CCA GTT CTC GAA ACA CCT 435
Ala Gln Arg Ala Gln Leu Glu Gln Gly Glu Pro Val Leu Glu Thr Pro
125 130 135

GTG GAG AGC CAG CAA CAT GAG ATT GAA TCC CGG ATC CTG GAT TTA AGG 483
Val Glu Ser Gln Gln His Glu Ile Glu Ser Arg Ile Leu Asp Leu Arg
140 145 150

GCT ATG ATG GAG AAG CTG GTA AAA TCC ATC AGC CAA CTG AAA GAC CAG 531
Ala Met Met Glu Lys Leu Val Lys Ser Ile Ser Gln Leu Lys Asp Gln
155 160 165

CAG GAT GTC TTC TGC TTC CGA TAT AAG ATC CAG GCC AAA GGG AAG ACA 579
Gln Asp Val Phe Cys Phe Arg Tyr Lys Ile Gln Ala Lys Gly Lys Thr
170 175 180 185



CCC TCT CTG GAC CCC CAT CAG ACC AAA GAG CAG AAG ATT CTG CAG GAA 627
Pro Ser Leu Asp Pro His Gln Thr Lys Glu Gln Lys Ile Leu Gln Glu
190 195 200

ACT CTC AAT GAA CTG GAC AAA AGG AGA AAG GAG GTG CTG GAT GCC TCC 675
Thr Leu Asn Glu Leu Asp Lys Arg Arg Lys Glu Val Leu Asp Ala Ser
205 210 215

AAA GCA CTG CTA GGC CGA TTA ACT ACC CTA ATC GAG CTA CTG CTG CCA 723
Lys Ala Leu Leu Gly Arg Leu Thr Thr Leu Ile Glu Leu Leu Leu Pro
220 225 230

AAG TTG GAG GAG TGG AAG GCC CAG CAG CAA AAA GCC TGC ATC AGA GCT 771
Lys Leu Glu Glu Trp Lys Ala Gln Gln Gln Lys Ala Cys Ile Arg Ala
235 240 245

CCC ATT GAC CAC GGG TTG GAA CAG CTG GAG ACA TGG TTC ACA GCT GGA 819
Pro Ile Asp His Gly Leu Glu Gln Leu Glu Thr Trp Phe Thr Ala Gly
250 255 260 265

GCA AAG CTG TTG TTT CAC CTG AGG CAG CTG CTG AAG GAG CTG AAG GGA 867
Ala Lys Leu Leu Phe His Leu Arg Gln Leu Leu Lys Glu Leu Lys Gly
270 275 280

CTG AGT TGC CTG GTT AGC TAT CAG GAT GAC CCT CTG ACC AAA GGG GTG 915
Leu Ser Cys Leu Val Ser Tyr Gln Asp Asp Pro Leu Thr Lys Gly Val
285 290 295

GAC CTA CGC AAC GCC CAG GTC ACA GAG TTG CTA CAG CGT CTG CTC CAC 963
Asp Leu Arg Asn Ala Gln Val Thr Glu Leu Leu Gln Arg Leu Leu His
300 305 310

AGA GCC TTT GTG GTA GAA ACC CAG CCC TGC ATG CCC CAA ACT CCC CAT 1011
Arg Ala Phe Val Val Glu Thr Gln Pro Cys Met Pro Gln Thr Pro His
315 320 325

CGA CCC CTC ATC CTC AAG ACT GGC AGC AAG TTC ACC GTC CGA ACA AGG 1059
Arg Pro Leu Ile Leu Lys Thr Gly Ser Lys Phe Thr Val Arg Thr Arg
330 335 340 345

CTG CTG GTG AGA CTC CAG GAA GGC AAT GAG TCA CTG ACT GTG GAA GTC 1107
Leu Leu Val Arg Leu Gln Glu Gly Asn Glu Ser Leu Thr Val Glu Val
350 355 360

TCC ATT GAC AGG AAT CCT CCT CAA TTA CAA GGC TTC CGG AAG TTC AAC 1155
Ser Ile Asp Arg Asn Pro Pro Gln Leu Gln Gly Phe Arg Lys Phe Asn
365 370 375

ATT CTG ACT TCA AAC CAG AAA ACT TTG ACC CCC GAG AAG GGG CAG AGT 1203
Ile Leu Thr Ser Asn Gln Lys Thr Leu Thr Pro Glu Lys Gly Gln Ser
380 385 390

CAG GGT TTG ATT TGG GAC TTT GGT TAC CTG ACT CTG GTG GAG CAA CGT 1251
Gln Gly Leu Ile Trp Asp Phe Gly Tyr Leu Thr Leu Val Glu Gln Arg
395 400 405

TCA GGT GGT TCA GGA AAG GGC AGC AAT AAG GGG CCA CTA GGT GTG ACA 1299
Ser Gly Gly Ser Gly Lys Gly Ser Asn Lys Gly Pro Leu Gly Val Thr
410 415 420 425

GAG GAA CTG CAC ATC ATC AGC TTC ACG GTC AAA TAT ACC TAC CAG GGT 1347
Glu Glu Leu His Ile Ile Ser Phe Thr Val Lys Tyr Thr Tyr Gln Gly
430 435 440

CTG AAG CAG GAG CTG AAA ACG GAC ACC CTC CCT GTG GTG ATT ATT TCC 1395
Leu Lys Gln Glu Leu Lys Thr Asp Thr Leu Pro Val Val Ile Ile Ser
445 450 455



AAC ATG AAC CAG CTC TCA ATT GCC TGG GCT TCA GTT CTC TGG TTC AAT 1443
Asn Met Asn Gln Leu Ser Ile Ala Trp Ala Ser Val Leu Trp Phe Asn
460 465 470

TTG CTC AGC CCA AAC CTT CAG AAC CAG CAG TTC TTC TCC AAC CCC CCC 1491
Leu Leu Ser Pro Asn Leu Gln Asn Gln Gln Phe Phe Ser Asn Pro Pro
475 480 485

AAG GCC CCC TGG AGC TTG CTG GGC CCT GCT CTC AGT TGG CAG TTC TCC 1539
Lys Ala Pro Trp Ser Leu Leu Gly Pro Ala Leu Ser Trp Gln Phe Ser
490 495 500 505

TCC TAT GTT GGC CGA GGC CTC AAC TCA GAC CAG CTG AGC ATG CTG AGA 1587
Ser Tyr Val Gly Arg Gly Leu Asn Ser Asp Gln Leu Ser Met Leu Arg
510 515 520

AAC AAG CTG TTC GGG CAG AAC TGT AGG ACT GAG GAT CCA TTA TTG TCC 1635
Asn Lys Leu Phe Gly Gln Asn Cys Arg Thr Glu Asp Pro Leu Leu Ser
525 530 535

TGG GCT GAC TTC ACT AAG CGA GAG AGC CCT CCT GGC AAG TTA CCA TTC 1683
Trp Ala Asp Phe Thr Lys Arg Glu Ser Pro Pro Gly Lys Leu Pro Phe
540 545 550

TGG ACA TGG CTG GAC AAA ATT CTG GAG TTG GTA CAT GAC CAC CTG AAG 1731
Trp Thr Trp Leu Asp Lys Ile Leu Glu Leu Val His Asp His Leu Lys
555 560 565

GAT CTC TGG AAT GAT GGA CGC ATC ATG GGC TTT GTG AGT CGG AGC CAG 1779
Asp Leu Trp Asn Asp Gly Arg Ile Met Gly Phe Val Ser Arg Ser Gln
570 575 580 585

GAG CGC CGG CTG CTG AAG AAG ACC ATG TCT GGC ACC TTT CTA CTG CGC 1827
Glu Arg Arg Leu Leu Lys Lys Thr Met Ser Gly Thr Phe Leu Leu Arg
590 595 600

TTC AGT GAA TCG TCA GAA GGG GGC ATT ACC TGC TCC TGG GTG GAG CAC 1875
Phe Ser Glu Ser Ser Glu Gly Gly Ile Thr Cys Ser Trp Val Glu His
605 610 615

CAG GAT GAT GAC AAG GTG CTC ATC TAC TCT GTG CAA CCG TAC ACG AAG 1923
Gln Asp Asp Asp Lys Val Leu Ile Tyr Ser Val Gln Pro Tyr Thr Lys
620 625 630

GAG GTG CTG CAG TCA CTC CCG CTG ACT GAA ATC ATC CGC CAT TAC CAG 1971
Glu Val Leu Gln Ser Leu Pro Leu Thr Glu Ile Ile Arg His Tyr Gln
635 640 645

TTG CTC ACT GAG GAG AAT ATA CCT GAA AAC CCA CTG CGC TTC CTC TAT 2019
Leu Leu Thr Glu Glu Asn Ile Pro Glu Asn Pro Leu Arg Phe Leu Tyr
650 655 660 665

CCC CGA ATC CCC CGG GAT GAA GCT TTT GGG TGC TAC TAC CAG GAG AAA 2067
Pro Arg Ile Pro Arg Asp Glu Ala Phe Gly Cys Tyr Tyr Gln Glu Lys
670 675 680

GTT AAT CTC CAG GAA CGG AGG AAA TAC CTG AAA CAC AGG CTC ATT GTG 2115
Val Asn Leu Gln Glu Arg Arg Lys Tyr Leu Lys His Arg Leu Ile Val
685 690 695

GTC TCT AAT AGA CAG GTG GAT GAA CTG CAA CAA CCG CTG GAG CTT AAG 2163
Val Ser Asn Arg Gln Val Asp Glu Leu Gln Gln Pro Leu Glu Leu Lys
700 705 710

CCA GAG CCA GAG CTG GAG TCA TTA GAG CTG GAA CTA GGG CTG GTG CCA 2211
Pro Glu Pro Glu Leu Glu Ser Leu Glu Leu Glu Leu Gly Leu Val Pro
715 720 725



GAG CCA GAG CTC AGC CTG GAC TTA GAG CCA CTG CTG AAG GCA GGG CTG 2259
Glu Pro Glu Leu Ser Leu Asp Leu Glu Pro Leu Leu Lys Ala Gly Leu
730 735 740 745

GAT CTG GGG CCA GAG CTA GAG TCT GTG CTG GAG TCC ACT CTG GAG CCT 2307
Asp Leu Gly Pro Glu Leu Glu Ser Val Leu Glu Ser Thr Leu Glu Pro
750 755 760

GTG ATA GAG CCC ACA CTA TGC ATG GTA TCA CAA ACA GTG CCA GAG CCA 2355
Val Ile Glu Pro Thr Leu Cys Met Val Ser Gln Thr Val Pro Glu Pro
765 770 775

GAC CAA GGA CCT GTA TCA CAG CCA GTG CCA GAG CCA GAT TTG CCC TGT 2403
Asp Gln Gly Pro Val Ser Gln Pro Val Pro Glu Pro Asp Leu Pro Cys
780 785 790

GAT CTG AGA CAT TTG AAC ACT GAG CCA ATG GAA ATC TTC AGA AAC TGT 2451
Asp Leu Arg His Leu Asn Thr Glu Pro Met Glu Ile Phe Arg Asn Cys
795 800 805

GTA AAG ATT GAA GAA ATC ATG CCG AAT GGT GAC CCA CTG TTG GCT GGC 2499
Val Lys Ile Glu Glu Ile Met Pro Asn Gly Asp Pro Leu Leu Ala Gly
810 815 820 825

CAG AAC ACC GTG GAT GAG GTT TAC GTC TCC CGC CCC AGC CAC TTC TAC 2547
Gln Asn Thr Val Asp Glu Val Tyr Val Ser Arg Pro Ser His Phe Tyr
830 835 840

ACT GAT GGA CCC TTG ATG CCT TCT GAC TTC TAGGAACCAC ATTTCCTCTG 2597
Thr Asp Gly Pro Leu Met Pro Ser Asp Phe
845 850

TTCTTTTCAT ATCTCTTTGC CCTTCCTACT CCTCATAGCA TGATATTGTT CTCCAAGGAT 2657

GGGAATCAGG CATGTGTCCC TTCCAAGCTG TGTTAACTGT TCAAACTCAG GCCTGTGTGA 2717

CTCCATTGGG GTGAGAGGTG AAAGCATAAC ATGGGTACAG AGGGGACAAC AATGAATCAG 2777

AACAGATGCT GAGCCATAGG TCTAAATAGG ATCCTGGAGG CTGCCTGCTG TGCTGGGAGG 2837

TATAGGGGTC CTGGGGGCAG GCCAGGGCAG TTGACAGGTA CTTGGAGGGC TCAGGGCAGT 2897

GGCTTCTTTC CAGTATGGAA GGATTTCAAC ATTTTAATAG TTGGTTAGGC TAAACTGGTG 2957

CATACTGGCA TTGGCCTTGG TGGGGAGCAC AGACACAGGA TAGGACTCCA TTTCTTTCTT 3017

CCATTCCTTC ATGTCTAGGA TAACTTGCTT TCTTCTTTCC TTTACTCCTG GCTCAAGCCC 3077

TGAATTTCTT CTTTTCCTGC AGGGGTTGAG AGCTTTCTGC CTTAGCCTAC CATGTGAAAC 3137

TCTACCCTGA AGAAAGGGAT GGATAGGAAG TAGACCTCTT TTTCTTACCA GTCTCCTCCC 3197

CTACTCTGCC CCCTAAGCTG GCTGTACCTG TTCCTCCCCC ATAAAATGAT CCTGCCAATC 3257

TAAAAAAAAA A 3268

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 851 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:



Met Ala Gln Trp Glu Met Leu Gln Asn Leu Asp Ser Pro Phe Gln Asp
1 5 10 15

Gln Leu His Gln Leu Tyr Ser His Ser Leu Leu Pro Val Asp Ile Arg
20 25 30

Gln Tyr Leu Ala Val Trp Ile Glu Asp Gln Asn Trp Gln Glu Ala Ala
35 40 45

Leu Gly Ser Asp Asp Ser Lys Ala Thr Met Leu Phe Phe His Phe Leu
50 55 60

Asp Gln Leu Asn Tyr Glu Cys Gly Arg Cys Ser Gln Asp Pro Glu Ser
65 70 75 80

Leu Leu Leu Gln His Asn Leu Arg Lys Phe Cys Arg Asp Ile Gln Pro
85 90 95

Phe Ser Gln Asp Pro Thr Gln Leu Ala Glu Met Ile Phe Asn Leu Leu
100 105 110

Leu Glu Glu Lys Arg Ile Leu Ile Gln Ala Gln Arg Ala Gln Leu Glu
115 120 125

Gln Gly Glu Pro Val Leu Glu Thr Pro Val Glu Ser Gln Gln His Glu
130 135 140

Ile Glu Ser Arg Ile Leu Asp Leu Arg Ala Met Met Glu Lys Leu Val
145 150 155 160

Lys Ser Ile Ser Gln Leu Lys Asp Gln Gln Asp Val Phe Cys Phe Arg
165 170 175

Tyr Lys Ile Gln Ala Lys Gly Lys Thr Pro Ser Leu Asp Pro His Gln
180 185 190

Thr Lys Glu Gln Lys Ile Leu Gln Glu Thr Leu Asn Glu Leu Asp Lys
195 200 205

Arg Arg Lys Glu Val Leu Asp Ala Ser Lys Ala Leu Leu Gly Arg Leu
210 215 220

Thr Thr Leu Ile Glu Leu Leu Leu Pro Lys Leu Glu Glu Trp Lys Ala
225 230 235 240

Gln Gln Gln Lys Ala Cys Ile Arg Ala Pro Ile Asp His Gly Leu Glu
245 250 255

Gln Leu Glu Thr Trp Phe Thr Ala Gly Ala Lys Leu Leu Phe His Leu
260 265 270

Arg Gln Leu Leu Lys Glu Leu Lys Gly Leu Ser Cys Leu Val Ser Tyr
275 280 285

Gln Asp Asp Pro Leu Thr Lys Gly Val Asp Leu Arg Asn Ala Gln Val
290 295 300

Thr Glu Leu Leu Gln Arg Leu Leu His Arg Ala Phe Val Val Glu Thr
305 310 315 320

Gln Pro Cys Met Pro Gln Thr Pro His Arg Pro Leu Ile Leu Lys Thr
325 330 335

Gly Ser Lys Phe Thr Val Arg Thr Arg Leu Leu Val Arg Leu Gln Glu
340 345 350

Gly Asn Glu Ser Leu Thr Val Glu Val Ser Ile Asp Arg Asn Pro Pro
355 360 365



Gln Leu Gln Gly Phe Arg Lys Phe Asn Ile Leu Thr Ser Asn Gln Lys
370 375 380

Thr Leu Thr Pro Glu Lys Gly Gln Ser Gln Gly Leu Ile Trp Asp Phe
385 390 395 400

Gly Tyr Leu Thr Leu Val Glu Gln Arg Ser Gly Gly Ser Gly Lys Gly
405 410 415

Ser Asn Lys Gly Pro Leu Gly Val Thr Glu Glu Leu His Ile Ile Ser
420 425 430

Phe Thr Val Lys Tyr Thr Tyr Gln Gly Leu Lys Gln Glu Leu Lys Thr
435 440 445

Asp Thr Leu Pro Val Val Ile Ile Ser Asn Met Asn Gln Leu Ser Ile
450 455 460

Ala Trp Ala Ser Val Leu Trp Phe Asn Leu Leu Ser Pro Asn Leu Gln
465 470 475 480

Asn Gln Gln Phe Phe Ser Asn Pro Pro Lys Ala Pro Trp Ser Leu Leu
485 490 495

Gly Pro Ala Leu Ser Trp Gln Phe Ser Ser Tyr Val Gly Arg Gly Leu
500 505 510

Asn Ser Asp Gln Leu Ser Met Leu Arg Asn Lys Leu Phe Gly Gln Asn
515 520 525

Cys Arg Thr Glu Asp Pro Leu Leu Ser Trp Ala Asp Phe Thr Lys Arg
530 535 540

Glu Ser Pro Pro Gly Lys Leu Pro Phe Trp Thr Trp Leu Asp Lys Ile
545 550 555 560

Leu Glu Leu Val His Asp His Leu Lys Asp Leu Trp Asn Asp Gly Arg
565 570 575

Ile Met Gly Phe Val Ser Arg Ser Gln Glu Arg Arg Leu Leu Lys Lys
580 585 590

Thr Met Ser Gly Thr Phe Leu Leu Arg Phe Ser Glu Ser Ser Glu Gly
595 600 605

Gly Ile Thr Cys Ser Trp Val Glu His Gln Asp Asp Asp Lys Val Leu
610 615 620

Ile Tyr Ser Val Gln Pro Tyr Thr Lys Glu Val Leu Gln Ser Leu Pro
625 630 635 640

Leu Thr Glu Ile Ile Arg His Tyr Gln Leu Leu Thr Glu Glu Asn Ile
645 650 655

Pro Glu Asn Pro Leu Arg Phe Leu Tyr Pro Arg Ile Pro Arg Asp Glu
660 665 670

Ala Phe Gly Cys Tyr Tyr Gln Glu Lys Val Asn Leu Gln Glu Arg Arg
675 680 685

Lys Tyr Leu Lys His Arg Leu Ile Val Val Ser Asn Arg Gln Val Asp
690 695 700

Glu Leu Gln Gln Pro Leu Glu Leu Lys Pro Glu Pro Glu Leu Glu Ser
705 710 715 720

Leu Glu Leu Glu Leu Gly Leu Val Pro Glu Pro Glu Leu Ser Leu Asp
725 730 735

Leu Glu Pro Leu Leu Lys Ala Gly Leu Asp Leu Gly Pro Glu Leu Glu
740 745 750

Ser Val Leu Glu Ser Thr Leu Glu Pro Val Ile Glu Pro Thr Leu Cys
755 760 765

Met Val Ser Gln Thr Val Pro Glu Pro Asp Gln Gly Pro Val Ser Gln
770 775 780

Pro Val Pro Glu Pro Asp Leu Pro Cys Asp Leu Arg His Leu Asn Thr
785 790 795 800

Glu Pro Met Glu Ile Phe Arg Asn Cys Val Lys Ile Glu Glu Ile Met
805 810 815

Pro Asn Gly Asp Pro Leu Leu Ala Gly Gln Asn Thr Val Asp Glu Val
820 825 830

Tyr Val Ser Arg Pro Ser His Phe Tyr Thr Asp Gly Pro Leu Met Pro
835 840 845

Ser Asp Phe
850

(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3943 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: both

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(vii) IMMEDIATE SOURCE:

(B) CLONE: Human Stat91

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 197..2449

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

ATTAAACCTC TCGCCGAGCC CCTCCGCAGA CTCTGCGCCG GAAAGTTTCA TTTGCTGTAT 60

GCCATCCTCG AGAGCTGTCT AGGTTAACGT TCGCACTCTG TGTATAT^C CTCGACAGTC 120

TTGGCACCTA ACGTGCTGTG CGTAGCTGCT CCTTTGGTTG AATCCCCAGG CCCTTGTTGG 180

GGCACAAGGT GGCAGG ATG TCT CAG TGG TAC GAA CTT CAG CAG CTT GAC 229
Met Ser Gln Trp Tyr Glu Leu Gln Gln Leu Asp
1 5 10

TCA AAA TTC CTG GAG CAG GTT CAC CAG CTT TAT GAT GAC AGT TTT CCC 277
Ser Lys Phe Leu Glu Gln Val His Gln Leu Tyr Asp Asp Ser Phe Pro
15 20 25

ATG GAA ATC AGA CAG TAC CTG GCA CAG TGG TTA GAA AAG CAA GAC TGG 325
Met Glu Ile Arg Gln Tyr Leu Ala Gln Trp Leu Glu Lys Gln Asp Trp
30 35 40



GAG CAC GCT GCC AAT GAT GTT TCA TTT GCC ACC ATC CGT TTT CAT GAC 373
Glu His Ala Ala Asn Asp Val Ser Phe Ala Thr Ile Arg Phe His Asp
45 50 55

CTC CTG TCA CAG CTG GAT GAT CAA TAT AGT CGC TTT TCT TTG GAG AAT 421
Leu Leu Ser Gln Leu Asp Asp Gln Tyr Ser Arg Phe Ser Leu Glu Asn
60 65 70 75

AAC TTC TTG CTA CAG CAT AAC ATA AGG AAA AGC AAG CGT AAT CTT CAG 469
Asn Phe Leu Leu Gln His Asn Ile Arg Lys Ser Lys Arg Asn Leu Gln
80 85 90

GAT AAT TTT CAG GAA GAC CCA ATC CAG ATG TCT ATG ATC ATT TAC AGC 517
Asp Asn Phe Gln Glu Asp Pro Ile Gln Met Ser Met Ile Ile Tyr Ser
95 100 105

TGT CTG AAG GAA GAA AGG AAA ATT CTG GAA AAC GCC CAG AGA TTT AAT 565
Cys Leu Lys Glu Glu Arg Lys Ile Leu Glu Asn Ala Gln Arg Phe Asn
110 115 120

CAG GCT CAG TCG GGG AAT ATT CAG AGC ACA GTG ATG TTA GAC AAA CAG 613
Gln Ala Gln Ser Gly Asn Ile Gln Ser Thr Val Met Leu Asp Lys Gln
125 130 135

AAA GAG CTT GAC AGT AAA GTC AGA AAT GTG AAG GAC AAG GTT ATG TGT 661
Lys Glu Leu Asp Ser Lys Val Arg Asn Val Lys Asp Lys Val Met Cys
140 145 150 155

ATA GAG CAT GAA ATC AAG AGC CTG GAA GAT TTA CAA GAT GAA TAT GAC 709
Ile Glu His Glu Ile Lys Ser Leu Glu Asp Leu Gln Asp Glu Tyr Asp
160 165 170

TTC AAA TGC AAA ACC TTG CAG AAC AGA GAA CAC GAG ACC AAT GGT GTG 757
Phe Lys Cys Lys Thr Leu Gln Asn Arg Glu His Glu Thr Asn Gly Val
175 180 185

GCA AAG AGT GAT CAG AAA CAA GAA CAG CTG TTA CTC AAG AAG ATG TAT 805
Ala Lys Ser Asp Gln Lys Gln Glu Gln Leu Leu Leu Lys Lys Met Tyr
190 195 200

TTA ATG CTT GAC AAT AAG AGA AAG GAA GTA GTT CAC AAA ATA ATA GAG 853
Leu Met Leu Asp Asn Lys Arg Lys Glu Val Val His Lys Ile Ile Glu
205 210 215

TTG CTG AAT GTC ACT GAA CTT ACC CAG AAT GCC CTG ATT AAT GAT GAA 901
Leu Leu Asn Val Thr Glu Leu Thr Gln Asn Ala Leu Ile Asn Asp Glu
220 225 230 235

CTA GTG GAG TGG AAG CGG AGA CAG CAG AGC GCC TGT ATT GGG GGG CCG 949
Leu Val Glu Trp Lys Arg Arg Gln Gln Ser Ala Cys Ile Gly Gly Pro
240 245 250

CCC AAT GCT TGC TTG GAT CAG CTG CAG AAC TGG TTC ACT ATA GTT GCG 997
Pro Asn Ala Cys Leu Asp Gln Leu Gln Asn Trp Phe Thr Ile Val Ala
255 260 265

GAG AGT CTG CAG CAA GTT CGG CAG CAG CTT AAA AAG TTG GAG GAA TTG 1045
Glu Ser Leu Gln Gln Val Arg Gln Gln Leu Lys Lys Leu Glu Glu Leu
270 275 280

GAA CAG AAA TAC ACC TAC GAA CAT GAC CCT ATC ACA AAA AAC AAA CAA 1093
Glu Gln Lys Tyr Thr Tyr Glu His Asp Pro Ile Thr Lys Asn Lys Gln
285 290 295

GTG TTA TGG GAC CGC ACC TTC AGT CTT TTC CAG CAG CTC ATT CAG AGC 1141
Val Leu Trp Asp Arg Thr Phe Ser Leu Phe Gln Gln Leu Ile Gln Ser
300 305 310 315



TCG TTT GTG GTG GAA AGA CAG CCC TGC ATG CCA ACG CAC CCT CAG AGG 1189
Ser Phe Val Val Glu Arg Gln Pro Cys Met Pro Thr His Pro Gln Arg
320 325 330

CCG CTG GTC TTG AAG ACA GGG GTC CAG TTC ACT GTG AAG TTG AGA CTG 1237
Pro Leu Val Leu Lys Thr Gly Val Gln Phe Thr Val Lys Leu Arg Leu
335 340 345

TTG GTG AAA TTG CAA GAG CTG AAT TAT AAT TTG AAA GTC AAA GTC TTA 1285
Leu Val Lys Leu Gln Glu Leu Asn Tyr Asn Leu Lys Val Lys Val Leu
350 355 360

TTT GAT AAA GAT GTG AAT GAG AGA AAT ACA GTA AAA GGA TTT AGG AAG 1333
Phe Asp Lys Asp Val Asn Glu Arg Asn Thr Val Lys Gly Phe Arg Lys
365 370 375

TTC AAC ATT TTG GGC ACG CAC ACA AAA GTG ATG AAC ATG GAG GAG TCC 1381
Phe Asn Ile Leu Gly Thr His Thr Lys Val Met Asn Met Glu Glu Ser
380 385 390 395

ACC AAT GGC AGT CTG GCG GCT GAA TTT CGG CAC CTG CAA TTG AAA GAA 1429
Thr Asn Gly Ser Leu Ala Ala Glu Phe Arg His Leu Gln Leu Lys Glu
400 405 410

CAG AAA AAT GCT GGC ACC AGA ACG AAT GAG GGT CCT CTC ATC GTT ACT 1477
Gln Lys Asn Ala Gly Thr Arg Thr Asn Glu Gly Pro Leu Ile Val Thr
415 420 425

GAA GAG CTT CAC TCC CTT AGT TTT GAA ACC CAA TTG TGC CAG CCT GGT 1525
Glu Glu Leu His Ser Leu Ser Phe Glu Thr Gln Leu Cys Gln Pro Gly
430 435 440

TTG GTA ATT GAC CTC GAG ACG ACC TCT CTG CCC GTT GTG GTG ATC TCC 1573
Leu Val Ile Asp Leu Glu Thr Thr Ser Leu Pro Val Val Val Ile Ser
445 450 455

AAC GTC AGC CAG CTC CCG AGC GGT TGG GCC TCC ATC CTT TGG TAC AAC 1621
Asn Val Ser Gln Leu Pro Ser Gly Trp Ala Ser Ile Leu Trp Tyr Asn
460 465 470 475

ATG CTG GTG GCG GAA CCC AGG AAT CTG TCC TTC TTC CTG ACT CCA CCA 1669
Met Leu Val Ala Glu Pro Arg Asn Leu Ser Phe Phe Leu Thr Pro Pro
480 485 490

TGT GCA CGA TGG GCT CAG CTT TCA GAA GTG CTG AGT TGG CAG TTT TCT 1717
Cys Ala Arg Trp Ala Gln Leu Ser Glu Val Leu Ser Trp Gln Phe Ser
495 500 505

TCT GTC ACC AAA AGA GGT CTC AAT GTG GAC CAG CTG AAC ATG TTG GGA 1765
Ser Val Thr Lys Arg Gly Leu Asn Val Asp Gln Leu Asn Met Leu Gly
510 515 520

GAG AAG CTT CTT GGT CCT AAC GCC AGC CCC GAT GGT CTC ATT CCG TGG 1813
Glu Lys Leu Leu Gly Pro Asn Ala Ser Pro Asp Gly Leu Ile Pro Trp
525 530 535

ACG AGG TTT TGT AAG GAA AAT ATA AAT GAT AAA AAT TTT CCC TTC TGG 1861
Thr Arg Phe Cys Lys Glu Asn Ile Asn Asp Lys Asn Phe Pro Phe Trp
540 545 550 555

CTT TGG ATT GAA AGC ATC CTA GAA CTC ATT AAA AAA CAC CTG CTC CCT 1909
Leu Trp Ile Glu Ser Ile Leu Glu Leu Ile Lys Lys His Leu Leu Pro
560 565 570

CTC TGG AAT GAT GGG TGC ATC ATG GGC TTC ATC AGC AAG GAG CGA GAG 1957
Leu Trp Asn Asp Gly Cys Ile Met Gly Phe Ile Ser Lys Glu Arg Glu
575 580 585



CGT GCC CTG TTG AAG GAC CAG CAG CCG GGG ACC TTC CTG CTG CGG TTC 2005
Arg Ala Leu Leu Lys Asp Gln Gln Pro Gly Thr Phe Leu Leu Arg Phe
590 595 600

AGT GAG AGC TCC CGG GAA GGG GCC ATC ACA TTC ACA TGG GTG GAG CGG 2053
Ser Glu Ser Ser Arg Glu Gly Ala Ile Thr Phe Thr Trp Val Glu Arg
605 610 615

TCC CAG AAC GGA GGC GAA CCT GAC TTC CAT GCG GTT GAA CCC TAC ACG 2101
Ser Gln Asn Gly Gly Glu Pro Asp Phe His Ala Val Glu Pro Tyr Thr
620 625 630 635

AAG AAA GAA CTT TCT GCT GTT ACT TTC CCT GAC ATC ATT CGC AAT TAC 2149
Lys Lys Glu Leu Ser Ala Val Thr Phe Pro Asp Ile Ile Arg Asn Tyr
640 645 650

AAA GTC ATG GCT GCT GAG AAT ATT CCT GAG AAT CCC CTG AAG TAT CTG 2197
Lys Val Met Ala Ala Glu Asn Ile Pro Glu Asn Pro Leu Lys Tyr Leu
655 660 665

TAT CCA AAT ATT GAC AAA GAC CAT GCC TTT GGA AAG TAT TAC TCC AGG 2245
Tyr Pro Asn Ile Asp Lys Asp His Ala Phe Gly Lys Tyr Tyr Ser Arg
670 675 680

CCA AAG GAA GCA CCA GAG CCA ATG GAA CTT GAT GGC CCT AAA GGA ACT 2293
Pro Lys Glu Ala Pro Glu Pro Met Glu Leu Asp Gly Pro Lys Gly Thr
685 690 695

GGA TAT ATC AAG ACT GAG TTG ATT TCT GTG TCT GAA GTT CAC CCT TCT 2341
Gly Tyr Ile Lys Thr Glu Leu Ile Ser Val Ser Glu Val His Pro Ser
700 705 710 715

AGA CTT CAG ACC ACA GAC AAC CTG CTC CCC ATG TCT CCT GAG GAG TTT 2389
Arg Leu Gln Thr Thr Asp Asn Leu Leu Pro Met Ser Pro Glu Glu Phe
720 725 730

GAC GAG GTG TCT CGG ATA GTG GGC TCT GTA GAA TTC GAC AGT ATG ATG 2437
Asp Glu Val Ser Arg Ile Val Gly Ser Val Glu Phe Asp Ser Met Met
735 740 745

AAC ACA GTA TAGAGCATGA ATTTTTTTCA TCTTCTCTGG CGACAGTTTT 2486
Asn Thr Val
750



CCTTCTCATC TGTGATTCCC TCCTGCTACT CTGTTCCTTC ACATCCTGTG TTTCTAGGGA 2546

AATGAAAGAA AGGCCAGCAA ATTCGCTGCA ACCTGTTGAT AGCAAGTGAA TTTTTCTCTA 2606

ACTCAGAAAC ATCAGTTACT CTGAAGGGCA TCATGCATCT TACTGAAGGT AAAATTGAAA 2666

GGCATTCTCT GAAGAGTGGG TTTCACAAGT GAAAAACATC CAGATACACC CAAAGTATCA 2726

GGACGAGAAT GAGGGTCCTT TGGGAAAGGA GAAGTTAAGC AACATCTAGC AAATGTTATG 2786

CATAAAGTCA GTGCCCAACT GTTATAGGTT GTTGGATAAA TCAGTCGlbA TTTAGGGAAC 2846

TGCTTGACGT AGGAACGGTA AATTTCTGTG GGAGAATTCT TACATGTTTT CTTTGCTTTA 2906

AGTGTAACTG GCAGTTTTCC ATTGGTTTAC CTGTGAAATA GTTCAAAGCC AAGTTTATAT 2966

ACAATTATAT CAGTCCTCTT TCAAAGGTAG CCATCATGGA TCTGGTAGGG GGAAAATGTG 3026

TATTTTATTA CATCTTTCAC ATTGGCTATT TAAAGACAAA GACAAATTCT GTTTCTTGAG 3086

AAGAGAACAT TTCCAAATTC ACAAGTTGTG TTTGATATCC AAAGCTGAAT ACATTCTGCT 3146

TTCATCTTGG TCACATACAA TTATTTTTAC AGTTCTCCCA AGGGAGTTAG GCTATTCACA 3206

ACCACTCATT CAAAAGTTGA AATTAACCAT AGATGTAGAT AAACTCAGAA ATTTAATTCA 3266

TGTTTCTTAA ATGGGCTACT TTGTCCTTTT TGTTATTAGG GTGGTATTTA GTCTATTAGC 3326

CACAAAATTG GGAAAGGAGT AGAAAAAGCA GTAACTGACA ACTTGAATAA TACACCAGAG 3386

ATAATATGAG AATCAGATCA TTTCAAAACT CATTTCCTAT GTAACTGCAT TGAGAACTGC 3446

ATATGTTTCG CTGATATATG TGTTTTTCAC ATTTGCGAAT GGTTCCATTC TCTCTCCTGT 3506

ACTTTTTCCA GACACTTTTT TGAGTGGATG ATGTTTCGTG AAGTATACTG TATTTTTACC 3566

TTTTTCCTTC CTTATCACTG ACACAAAAAG TAGATTAAGA GATGGGTTTG ACAAGGTTCT 3626

TCCCTTTTAC ATACTGCTGT CTATGTGGCT GTATCTTGTT TTTCCACTAC TGCTACCACA 3686

ACTATATTAT CATGCAAATG CTGTATTCTT CTTTGGTGGA GATAAAGATT TCTTGAGTTT 3746

TGTTTTAAAA TTAAAGCTAA AGTATCTGTA TTGCATTAAA TATAATATCG ACACAGTGCT 3806

TTCCGTGGCA CTGCATACAA TCTGAGGCCT CCTCTCTCAG TTTTTATATA GATGGCGAGA 3866

ACCTAAGTTT CAGTTGATTT TACAATTGAA ATGACTAAAA AACAAAGAAG ACAACATTAA 3926

AAACAATATT GTTTCTA 3943



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 750 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Met Ser Gln Trp Tyr Glu Leu Gln Gln Leu Asp Ser Lys Phe Leu Glu
1 5 10 15

Gln Val His Gln Leu Tyr Asp Asp Ser Phe Pro Met Glu Ile Arg Gln
20 25 30

Tyr Leu Ala Gln Trp Leu Glu Lys Gln Asp Trp Glu His Ala Ala Asn
35 40 45

Asp Val Ser Phe Ala Thr Ile Arg Phe His Asp Leu Leu Ser Gln Leu
50 55 60

Asp Asp Gln Tyr Ser Arg Phe Ser Leu Glu Asn Asn Phe Leu Leu Gln
65 70 75 80

His Asn Ile Arg Lys Ser Lys Arg Asn Leu Gln Asp Asn Phe Gln Glu
85 90 95

Asp Pro Ile Gln Met Ser Met Ile Ile Tyr Ser Cys Leu Lys Glu Glu
100 105 110

Arg Lys Ile Leu Glu Asn Ala Gln Arg Phe Asn Gln Ala Gln Ser Gly
115 120 125

Asn Ile Gln Ser Thr Val Met Leu Asp Lys Gln Lys Glu Leu Asp Ser
130 135 140

Lys Val Arg Asn Val Lys Asp Lys Val Met Cys Ile Glu His Glu Ile
145 150 155 160

Lys Ser Leu Glu Asp Leu Gln Asp Glu Tyr Asp Phe Lys Cys Lys Thr
165 170 175



Leu Gln Asn Arg Glu His Glu Thr Asn Gly Val Ala Lys Ser Asp Gln
180 185 190

Lys Gln Glu Gln Leu Leu Leu Lys Lys Met Tyr Leu Met Leu Asp Asn
195 200 205

Lys Arg Lys Glu Val Val His Lys Ile Ile Glu Leu Leu Asn Val Thr
210 215 220

Glu Leu Thr Gln Asn Ala Leu Ile Asn Asp Glu Leu Val Glu Trp Lys
225 230 235 240

Arg Arg Gln Gln Ser Ala Cys Ile Gly Gly Pro Pro Asn Ala Cys Leu
245 250 255

Asp Gln Leu Gln Asn Trp Phe Thr Ile Val Ala Glu Ser Leu Gln Gln
260 265 270

Val Arg Gln Gln Leu Lys Lys Leu Glu Glu Leu Glu Gln Lys Tyr Thr
275 280 285

Tyr Glu His Asp Pro Ile Thr Lys Asn Lys Gln Val Leu Trp Asp Arg
290 295 300

Thr Phe Ser Leu Phe Gln Gln Leu Ile Gln Ser Ser Phe Val Val Glu
305 310 315 320

Arg Gln Pro Cys Met Pro Thr His Pro Gln Arg Pro Leu Val Leu Lys
325 330 335

Thr Gly Val Gln Phe Thr Val Lys Leu Arg Leu Leu Val Lys Leu Gln
340 345 350

Glu Leu Asn Tyr Asn Leu Lys Val Lys Val Leu Phe Asp Lys Asp Val
355 360 365

Asn Glu Arg Asn Thr Val Lys Gly Phe Arg Lys Phe Asn Ile Leu Gly
370 375 380

Thr His Thr Lys Val Met Asn Met Glu Glu Ser Thr Asn Gly Ser Leu
385 390 395 400

Ala Ala Glu Phe Arg His Leu Gln Leu Lys Glu Gln Lys Asn Ala Gly
405 410 415

Thr Arg Thr Asn Glu Gly Pro Leu Ile Val Thr Glu Glu Leu His Ser
420 425 430

Leu Ser Phe Glu Thr Gln Leu Cys Gln Pro Gly Leu Val Ile Asp Leu
435 440 445

Glu Thr Thr Ser Leu Pro Val Val Val Ile Ser Asn Val Ser Gln Leu
450 455 460

Pro Ser Gly Trp Ala Ser Ile Leu Trp Tyr Asn Met Leu Val Ala Glu
465 470 475 480

Pro Arg Asn Leu Ser Phe Phe Leu Thr Pro Pro Cys Ala Arg Trp Ala
485 490 495

Gln Leu Ser Glu Val Leu Ser Trp Gln Phe Ser Ser Val Thr Lys Arg
500 505 510

Gly Leu Asn Val Asp Gln Leu Asn Met Leu Gly Glu Lys Leu Leu Gly
515 520 525

Pro Asn Ala Ser Pro Asp Gly Leu Ile Pro Trp Thr Arg Phe Cys Lys
530 535 540



Glu Asn Ile Asn Asp Lys Asn Phe Pro Phe Trp Leu Trp Ile Glu Ser
545 550 555 560

Ile Leu Glu Leu Ile Lys Lys His Leu Leu Pro Leu Trp Asn Asp Gly
565 570 575

Cys Ile Met Gly Phe Ile Ser Lys Glu Arg Glu Arg Ala Leu Leu Lys
580 585 590

Asp Gln Gln Pro Gly Thr Phe Leu Leu Arg Phe Ser Glu Ser Ser Arg
595 600 605

Glu Gly Ala Ile Thr Phe Thr Trp Val Glu Arg Ser Gln Asn Gly Gly
610 615 620

Glu Pro Asp Phe His Ala Val Glu Pro Tyr Thr Lys Lys Glu Leu Ser
625 630 635 640

Ala Val Thr Phe Pro Asp Ile Ile Arg Asn Tyr Lys Val Met Ala Ala
645 650 655

Glu Asn Ile Pro Glu Asn Pro Leu Lys Tyr Leu Tyr Pro Asn Ile Asp
660 665 670

Lys Asp His Ala Phe Gly Lys Tyr Tyr Ser Arg Pro Lys Glu Ala Pro
675 680 685

Glu Pro Met Glu Leu Asp Gly Pro Lys Gly Thr Gly Tyr Ile Lys Thr
690 695 700

Glu Leu Ile Ser Val Ser Glu Val His Pro Ser Arg Leu Gln Thr Thr
705 710 715 720

Asp Asn Leu Leu Pro Met Ser Pro Glu Glu Phe Asp Glu Val Ser Arg
725 730 735

Ile Val Gly Ser Val Glu Phe Asp Ser Met Met Asn Thr Val
740 745 750

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2607 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: both

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 197..2335

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
ATTAAACCTC TCGCCGAGCC CCTCCGCAGA CTCTGCGCCG GAAAGTTTCA TTTGCTGTAT 
GCCATCCTCG AGAGCTGTCT AGGTTAACGT TCGCACTCTG TGTATATAAC CTCGACAGTC 
TTGGCACCTA ACGTGCTGTG CGTAGCTGCT CCTTTGGTTG AATCCCCAGG CCCTTGTTGG 



GGCACAAGGT GGCAGG ATG TCT CAG TGG TAC GAA CTT CAG CAG CTT GAC 229
Met Ser Gln Trp Tyr Glu Leu Gln Gln Leu Asp
1 5 10

TCA AAA TTC CTG GAG CAG GTT CAC CAG CTT TAT GAT GAC AGT TTT CCC 277
Ser Lys Phe Leu Glu Gln Val His Gln Leu Tyr Asp Asp Ser Phe Pro
15 20 25 

ATG GAA ATC AGA CAG TAC CTG GCA CAG TGG TTA GAA AAG CAA GAC TGG 325
Met Glu Ile Arg Gln Tyr Leu Ala Gln Trp Leu Glu Lys Gln Asp Trp
30 35 40 

GAG CAC GCT GCC AAT GAT GTT TCA TTT GCC ACC ATC CGT TTT CAT GAC 373 
Glu His Ala Ala Asn Asp Val Ser Phe Ala Thr Ile Arg Phe His Asp
45 50 55 

CTC CTG TCA CAG CTG GAT GAT CAA TAT AGT CGC TTT TCT TTG GAG AAT 421 
Leu Leu Ser Gln Leu Asp Asp Gln Tyr Ser Arg Phe Ser Leu Glu Asn
60 65 70 75

AAC TTC TTG CTA CAG CAT AAC ATA AGG AAA AGC AAG CGT AAT CTT CAG 469
Asn Phe Leu Leu Gln His Asn Ile Arg Lys Ser Lys Arg Asn Leu Gln
80 85 90 

GAT AAT TTT CAG GAA GAC CCA ATC CAG ATG TCT ATG ATC ATT TAC AGC 517
Asp Asn Phe Gln Glu Asp Pro Ile Gln Met Ser Met Ile Ile Tyr Ser
95 100 105 

TGT CTG AAG GAA GAA AGG AAA ATT CTG GAA AAC GCC CAG AGA TTT AAT 565 
Cys Leu Lys Glu Glu Arg Lys Ile Leu Glu Asn Ala Gln Arg Phe Asn
110 115 120 

CAG GCT CAG TCG GGG AAT ATT CAG AGC ACA GTG ATG TTA GAC AAA CAG 613 
Gln Ala Gln Ser Gly Asn Ile Gln Ser Thr Val Met Leu Asp Lys Gln
125 130 135 

AAA GAG CTT GAC AGT AAA GTC AGA AAT GTG AAG GAC AAG GTT ATG TGT 661 
Lys Glu Leu Asp Ser Lys Val Arg Asn Val Lys Asp Lys Val Met Cys 
140 145 150 155 

ATA GAG CAT GAA ATC AAG AGC CTG GAA GAT TTA CAA GAT GAA TAT GAC 709 
Ile Glu His Glu Ile Lys Ser Leu Glu Asp Leu Gln Asp Glu Tyr Asp
160 165 170 

TTC AAA TGC AAA ACC TTG CAG AAC AGA GAA CAC GAG ACC AAT GGT GTG 757 
Phe Lys Cys Lys Thr Leu Gln Asn Arg Glu His Glu Thr Asn Gly Val
175 180 185 

GCA AAG AGT GAT CAG AAA CAA GAA CAG CTG TTA CTC AAG AAG ATG TAT 805 
Ala Lys Ser Asp Gln Lys Gln Glu Gln Leu Leu Leu Lys Lys Met Tyr
190 195 200 

TTA ATG CTT GAC AAT AAG AGA AAG GAA GTA GTT CAC AAA ATA ATA GAG 853 
Leu Met Leu Asp Asn Lys Arg Lys Glu Val Val His Lys Ile Ile Glu
205 210 215

TTG CTG AAT GTC ACT GAA CTT ACC CAG AAT GCC CTG ATT AAT GAT GAA 901 
Leu Leu Asn Val Thr Glu Leu Thr Gln Asn Ala Leu Ile Asn Asp Glu
220 225 230 235 

CTA GTG GAG TGG AAG CGG AGA CAG CAG AGC GCC TGT ATT GGG GGG CCG 949 
Leu Val Glu Trp Lys Arg Arg Gln Gln Ser Ala Cys Ile Gly Gly Pro
240 245 250 

CCC AAT GCT TGC TTG GAT CAG CTG CAG AAC TGG TTC ACT ATA GTT GCG 997
Pro Asn Ala Cys Leu Asp Gln Leu Gln Asn Trp Phe Thr Ile Val Ala
255 260 265 

GAG AGT CTG CAG CAA GTT CGG CAG CAG CTT AAA AAG TTG GAG GAA TTG 1045
Glu Ser Leu Gln Gln Val Arg Gln Gln Leu Lys Lys Leu Glu Glu Leu
270 275 280 



GAA CAG AAA TAC ACC TAC GAA CAT GAC CCT ATC ACA AAA AAC AAA CAA 1093
Glu Gln Lys Tyr Thr Tyr Glu His Asp Pro Ile Thr Lys Asn Lys Gln
285 290 295 

GTG TTA TGG GAC CGC ACC TTC AGT CTT TTC CAG CAG CTC ATT CAG AGC 1141 
Val Leu Trp Asp Arg Thr Phe Ser Leu Phe Gln Gln Leu Ile Gln Ser
300 305 310 315 

TCG TTT GTG GTG GAA AGA CAG CCC TGC ATG CCA ACG CAC CCT CAG AGG 1189
Ser Phe Val Val Glu Arg Gln Pro Cys Met Pro Thr His Pro Gln Arg
320 325 330 

CCG CTG GTC TTG AAG ACA GGG GTC CAG TTC ACT GTG AAG TTG AGA CTG 1237
Pro Leu Val Leu Lys Thr Gly Val Gln Phe Thr Val Lys Leu Arg Leu
335 340 345 

TTG GTG AAA TTG CAA GAG CTG AAT TAT AAT TTG AAA GTC AAA GTC TTA 1285
Leu Val Lys Leu Gln Glu Leu Asn Tyr Asn Leu Lys Val Lys Val Leu
350 355 360 

TTT GAT AAA GAT GTG AAT GAG AGA AAT ACA GTA AAA GGA TTT AGG AAG 1333 
Phe Asp Lys Asp Val Asn Glu Arg Asn Thr Val Lys Gly Phe Arg Lys
365 370 375 

TTC AAC ATT TTG GGC ACG CAC ACA AAA GTG ATG AAC ATG GAG GAG TCC 1381 
Phe Asn Ile Leu Gly Thr His Thr Lys Val Met Asn Met Glu Glu Ser
380 385 390 395 

ACC AAT GGC AGT CTG GCG GCT GAA TTT CGG CAC CTG CAA TTG AAA GAA 1429 
Thr Asn Gly Ser Leu Ala Ala Glu Phe Arg His Leu Gln Leu Lys Glu
400 405 410 

CAG AAA AAT GCT GGC ACC AGA ACG AAT GAG GGT CCT CTC ATC GTT ACT 1477 
Gln Lys Asn Ala Gly Thr Arg Thr Asn Glu Gly Pro Leu Ile Val Thr
415 420 425 

GAA GAG CTT CAC TCC CTT AGT TTT GAA ACC CAA TTG TGC CAG CCT GGT 1525 
Glu Glu Leu His Ser Leu Ser Phe Glu Thr Gln Leu Cys Gln Pro Gly
430 435 440 

TTG GTA ATT GAC CTC GAG ACG ACC TCT CTG CCC GTT GTG GTG ATC TCC 1573 
Leu Val Ile Asp Leu Glu Thr Thr Ser Leu Pro Val Val Val Ile Ser
445 450 455 

AAC GTC AGC CAG CTC CCG AGC GGT TGG GCC TCC ATC CTT TGG TAC AAC 1621 
Asn Val Ser Gln Leu Pro Ser Gly Trp Ala Ser Ile Leu Trp Tyr Asn
460 465 470 475 

ATG CTG GTG GCG GAA CCC AGG AAT CTG TCC TTC TTC CTG ACT CCA CCA 1669
Met Leu Val Ala Glu Pro Arg Asn Leu Ser Phe Phe Leu Thr Pro Pro
480 485 490

TGT GCA CGA TGG GCT CAG CTT TCA GAA GTG CTG AGT TGG CAG TTT TCT 1717
Cys Ala Arg Trp Ala Gln Leu Ser Glu Val Leu Ser Trp Gln Phe Ser
495 500 505 

TCT GTC ACC AAA AGA GGT CTC AAT GTG GAC CAG CTG AAC ATG TTG GGA 1765 
Ser Val Thr Lys Arg Gly Leu Asn Val Asp Gln Leu Asn Met Leu Gly
510 515 520 

GAG AAG CTT CTT GGT CCT AAC GCC AGC CCC GAT GGT CTC ATT CCG TGG 1813 
Glu Lys Leu Leu Gly Pro Asn Ala Ser Pro Asp Gly Leu Ile Pro Trp
525 530 535 



ACG AGG TTT TGT AAG GAA AAT ATA AAT GAT AAA AAT TTT CCC TTC TGG 1861
Thr Arg Phe Cys Lys Glu Asn Ile Asn Asp Lys Asn Phe Pro Phe Trp
540 545 550 555 

CTT TGG ATT GAA AGC ATC CTA GAA CTC ATT AAA AAA CAC CTG CTC CCT 1909
Leu Trp Ile Glu Ser Ile Leu Glu Leu Ile Lys Lys His Leu Leu Pro
560 565 570 

CTC TGG AAT GAT GGG TGC ATC ATG GGC TTC ATC AGC AAG GAG CGA GAG 1957 
Leu Trp Asn Asp Gly Cys Ile Met Gly Phe Ile Ser Lys Glu Arg Glu
575 580 585

CGT GCC CTG TTG AAG GAC CAG CAG CCG GGG ACC TTC CTG CTG CGG TTC 2005 
Arg Ala Leu Leu Lys Asp Gln Gln Pro Gly Thr Phe Leu Leu Arg Phe
590 595 600 

AGT GAG AGC TCC CGG GAA GGG GCC ATC ACA TTC ACA TGG GTG GAG CGG 2053
Ser Glu Ser Ser Arg Glu Gly Ala Ile Thr Phe Thr Trp Val Glu Arg
605 610 615

TCC CAG AAC GGA GGC GAA CCT GAC TTC CAT GCG GTT GAA CCC TAC ACG 2101
Ser Gln Asn Gly Gly Glu Pro Asp Phe His Ala Val Glu Pro Tyr Thr
620 625 630 635 

AAG AAA GAA CTT TCT GCT GTT ACT TTC CCT GAC ATC ATT CGC AAT TAC 2149
Lys Lys Glu Leu Ser Ala Val Thr Phe Pro Asp Ile Ile Arg Asn Tyr
640 645 650 

AAA GTC ATG GCT GCT GAG AAT ATT CCT GAG AAT CCC CTG AAG TAT CTG 2197 
Lys Val Met Ala Ala Glu Asn Ile Pro Glu Asn Pro Leu Lys Tyr Leu
655 660 665 

TAT CCA AAT ATT GAC AAA GAC CAT GCC TTT GGA AAG TAT TAC TCC AGG 2245
Tyr Pro Asn Ile Asp Lys Asp His Ala Phe Gly Lys Tyr Tyr Ser Arg
670 675 680 

CCA AAG GAA GCA CCA GAG CCA ATG GAA CTT GAT GGC CCT AAA GGA ACT 2293
Pro Lys Glu Ala Pro Glu Pro Met Glu Leu Asp Gly Pro Lys Gly Thr
685 690 695 

GGA TAT ATC AAG ACT GAG TTG ATT TCT GTG TCT GAA GTG TAAGTGAACA 2342 
Gly Tyr Ile Lys Thr Glu Leu Ile Ser Val Ser Glu Val
700 705 710 

CAGAAGAGTG ACATGTTTAC AAACCTCAAG CCAGCCTTGC TCCTGGCTGG GGCCTGTTGA 2402 

AGATGCTTGT ATTTTACTTT TCCATTGTAA TTGCTATCGC CATCACAGCT GAACTTGTTG 2462

AGATCCCCGT GTTACTGCCT ATCAGCATTT TACTACTTTA AAAAAAAAAA AAAAAGCCAA 2522 

AAACCAAATT TGTATTTAAG GTATATAAAT TTTCCCAAAA CTGATACCCT TTGAAAAAGT 2582

ATAAATAAAA TGAGCAAAAG TTGAA 2607

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 712 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 



Met Ser Gln Trp Tyr Glu Leu Gln Gln Leu Asp Ser Lys Phe Leu Glu
1 5 10 15



Gln Val His Gln Leu Tyr Asp Asp Ser Phe Pro Met Glu Ile Arg Gln
20 25 30

Tyr Leu Ala Gln Trp Leu Glu Lys Gln Asp Trp Glu His Ala Ala Asn
35 40 45

Asp Val Ser Phe Ala Thr Ile Arg Phe His Asp Leu Leu Ser Gln Leu
50 55 60

Asp Asp Gln Tyr Ser Arg Phe Ser Leu Glu Asn Asn Phe Leu Leu Gln
65 70 75 80

His Asn Ile Arg Lys Ser Lys Arg Asn Leu Gln Asp Asn Phe Gln Glu
85 90 95

Asp Pro Ile Gln Met Ser Met Ile Ile Tyr Ser Cys Leu Lys Glu Glu
100 105 110

Arg Lys Ile Leu Glu Asn Ala Gln Arg Phe Asn Gln Ala Gln Ser Gly
115 120 125

Asn Ile Gln Ser Thr Val Met Leu Asp Lys Gln Lys Glu Leu Asp Ser
130 135 140

Lys Val Arg Asn Val Lys Asp Lys Val Met Cys Ile Glu His Glu Ile
145 150 155 160

Lys Ser Leu Glu Asp Leu Gln Asp Glu Tyr Asp Phe Lys Cys Lys Thr
165 170 175

Leu Gln Asn Arg Glu His Glu Thr Asn Gly Val Ala Lys Ser Asp Gln
180 185 190

Lys Gln Glu Gln Leu Leu Leu Lys Lys Met Tyr Leu Met Leu Asp Asn
195 200 205

Lys Arg Lys Glu Val Val His Lys Ile Ile Glu Leu Leu Asn Val Thr
210 215 220

Glu Leu Thr Gln Asn Ala Leu Ile Asn Asp Glu Leu Val Glu Trp Lys
225 230 235 240

Arg Arg Gln Gln Ser Ala Cys Ile Gly Gly Pro Pro Asn Ala Cys Leu
245 250 255

Asp Gln Leu Gln Asn Trp Phe Thr Ile Val Ala Glu Ser Leu Gln Gln
260 265 270

Val Arg Gln Gln Leu Lys Lys Leu Glu Glu Leu Glu Gln Lys Tyr Thr
275 280 285

Tyr Glu His Asp Pro Ile Thr Lys Asn Lys Gln Val Leu Trp Asp Arg
290 295 300

Thr Phe Ser Leu Phe Gln Gln Leu Ile Gln Ser Ser Phe Val Val Glu
305 310 315 320

Arg Gln Pro Cys Met Pro Thr His Pro Gln Arg Pro Leu Val Leu Lys
325 330 335

Thr Gly Val Gln Phe Thr Val Lys Leu Arg Leu Leu Val Lys Leu Gln
340 345 350

Glu Leu Asn Tyr Asn Leu Lys Val Lys Val Leu Phe Asp Lys Asp Val
355 360 365

Asn Glu Arg Asn Thr Val Lys Gly Phe Arg Lys Phe Asn Ile Leu Gly
370 375 380



Thr His Thr Lys Val Met Asn Met Glu Glu Ser Thr Asn Gly Ser Leu
385 390 395 400

Ala Ala Glu Phe Arg His Leu Gln Leu Lys Glu Gln Lys Asn Ala Gly
405 410 415

Thr Arg Thr Asn Glu Gly Pro Leu Ile Val Thr Glu Glu Leu His Ser
420 425 430

Leu Ser Phe Glu Thr Gln Leu Cys Gln Pro Gly Leu Val Ile Asp Leu
435 440 445

Glu Thr Thr Ser Leu Pro Val Val Val Ile Ser Asn Val Ser Gln Leu
450 455 460

Pro Ser Gly Trp Ala Ser Ile Leu Trp Tyr Asn Met Leu Val Ala Glu
465 470 475 480

Pro Arg Asn Leu Ser Phe Phe Leu Thr Pro Pro Cys Ala Arg Trp Ala
485 490 495

Gln Leu Ser Glu Val Leu Ser Trp Gln Phe Ser Ser Val Thr Lys Arg
500 505 510

Gly Leu Asn Val Asp Gln Leu Asn Met Leu Gly Glu Lys Leu Leu Gly
515 520 525

Pro Asn Ala Ser Pro Asp Gly Leu Ile Pro Trp Thr Arg Phe Cys Lys
530 535 540

Glu Asn Ile Asn Asp Lys Asn Phe Pro Phe Trp Leu Trp Ile Glu Ser
545 550 555 560

Ile Leu Glu Leu Ile Lys Lys His Leu Leu Pro Leu Trp Asn Asp Gly
565 570 575

Cys Ile Met Gly Phe Ile Ser Lys Glu Arg Glu Arg Ala Leu Leu Lys
580 585 590

Asp Gln Gln Pro Gly Thr Phe Leu Leu Arg Phe Ser Glu Ser Ser Arg
595 600 605

Glu Gly Ala Ile Thr Phe Thr Trp Val Glu Arg Ser Gln Asn Gly Gly
610 615 620

Glu Pro Asp Phe His Ala Val Glu Pro Tyr Thr Lys Lys Glu Leu Ser
625 630 635 640

Ala Val Thr Phe Pro Asp Ile Ile Arg Asn Tyr Lys Val Met Ala Ala
645 650 655

Glu Asn Ile Pro Glu Asn Pro Leu Lys Tyr Leu Tyr Pro Asn Ile Asp
660 665 670

Lys Asp His Ala Phe Gly Lys Tyr Tyr Ser Arg Pro Lys Glu Ala Pro
675 680 685

Glu Pro Met Glu Leu Asp Gly Pro Lys Gly Thr Gly Tyr Ile Lys Thr
690 695 700

Glu Leu Ile Ser Val Ser Glu Val
705 710



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2277 base pairs 

(B) TYPE: nucleic acid 



(C) STRANDEDNESS: both

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mouse 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: Murine Stat91 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 5..2251

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

CAGG ATG TCA CAG TGG TTC GAG CTT CAG CAG CTG GAC TCC AAG TTC CTG 49
Met Ser Gln Trp Phe Glu Leu Gln Gln Leu Asp Ser Lys Phe Leu
1 5 10 15

GAG CAG GTC CAC CAG CTG TAC GAT GAC ACT TTC CCC ATG GAA ATC AGA 97 
Glu Gln Val His Gln Leu Tyr Asp Asp Ser Phe Pro Met Glu Ile Arg
20 25 30 

CAG TAC CTG GCC CAG TGG CTG GAA AAG CAA GAC TGG GAG CAC GCT GCC 145
Gln Tyr Leu Ala Gln Trp Leu Glu Lys Gln Asp Trp Glu His Ala Ala
35 40 45 

TAT GAT GTC TCG TTT GCG ACC ATC CGC TTC CAT GAC CTC CTC TCA CAG 193 
Tyr Asp Val Ser Phe Ala Thr Ile Arg Phe His Asp Leu Leu Ser Gln
50 55 60 

CTG GAC GAC CAG TAC AGC CGC TTT TCT CTG GAG AAT AAT TTC TTG TTG 241 
Leu Asp Asp Gln Tyr Ser Arg Phe Ser Leu Glu Asn Asn Phe Leu Leu
65 70 75 

CAG CAC AAC ATA CGG AAA AGC AAG CGT AAT CTC CAG GAT AAC TTC CAA 289
Gln His Asn Ile Arg Lys Ser Lys Arg Asn Leu Gln Asp Asn Phe Gln
80 85 90 95

GAA GAT CCC GTA CAG ATG TCC ATG ATC ATC TAC AAC TGT CTG AAG GAA 337 
Glu Asp Pro Val Gln Met Ser Met Ile Ile Tyr Asn Cys Leu Lys Glu
100 105 110 

GAA AGG AAG ATT TTG GAA AAT GCC CAA AGA TTT AAT CAG GCC CAG GAG 385
Glu Arg Lys Ile Leu Glu Asn Ala Gln Arg Phe Asn Gln Ala Gln Glu
115 120 125

GGA AAT ATT CAG AAC ACT GTG ATG TTA GAT AAA CAG AAG GAG CTG GAC 433
Gly Asn Ile Gln Asn Thr Val Met Leu Asp Lys Gln Lys Glu Leu Asp
130 135 140 

AGT AAA GTC AGA AAT GTG AAG GAT CAA GTC ATG TGC ATA GAG CAG GAA 481 
Ser Lys Val Arg Asn Val Lys Asp Gln Val Met Cys Ile Glu Gln Glu
145 150 155 

ATC AAG ACC CTA GAA GAA TTA CAA GAT GAA TAT GAC TTT AAA TGC AAA 529 
Ile Lys Thr Leu Glu Glu Leu Gln Asp Glu Tyr Asp Phe Lys Cys Lys
160 165 170 175

ACC TCT CAG AAC AGA GAA GGT GAA GCC AAT GGT GTG GCG AAG AGC GAC 577 
Thr Ser Gln Asn Arg Glu Gly Glu Ala Asn Gly Val Ala Lys Ser Asp
180 185 190 



CAA AAA CAG GAA CAG CTG CTG CTC CAC AAG ATG TTT TTA ATG CTT GAC 625 
Gln Lys Gln Glu Gln Leu Leu Leu His Lys Met Phe Leu Met Leu Asp
195 200 205 

AAT AAG AGA AAG GAG ATA ATT CAC AAA ATC AGA GAG TTG CTG AAT TCC 673 
Asn Lys Arg Lys Glu Ile Ile His Lys Ile Arg Glu Leu Leu Asn Ser
210 215 220 

ATC GAG CTC ACT CAG AAC ACT CTG ATT AAT GAC GAG CTC GTG GAG TGG 721 
Ile Glu Leu Thr Gln Asn Thr Leu Ile Asn Asp Glu Leu Val Glu Trp
225 230 235 

AAG CGA AGG CAG CAG AGC GCC TGC ATC GGG GGA CCG CCC AAC GCC TGC 769 
Lys Arg Arg Gln Gln Ser Ala Cys Ile Gly Gly Pro Pro Asn Ala Cys
240 245 250 255 

CTG GAT CAG CTG CAA ACG TGG TTC ACC ATT GTT GCA GAG ACC CTG CAG 817 
Leu Asp Gln Leu Gln Thr Trp Phe Thr Ile Val Ala Glu Thr Leu Gln
260 265 270

CAG ATC CGT CAG CAG CTT AAA AAG CTG GAG GAG TTG GAA CAG AAA TTC 865
Gln Ile Arg Gln Gln Leu Lys Lys Leu Glu Glu Leu Glu Gln Lys Phe
275 280 285 

ACC TAT GAG CCC GAC CQJ ATT ACA AAA AAC AAG CAG GTG TTG TCA GAT 913 
Thr Tyr Glu Pro Asp Pro Ile Thr Lys Asn Lys Gln Val Leu Ser Asp
290 295 300 

CGA ACC TTC CTC CTC TTC CAG CAG CTC ATT CAG AGC TCC TTC GTG GTA 961 
Arg Thr Phe Leu Leu Phe Gln Gln Leu Ile Gln Ser Ser Phe Val Val
305 310 315 

GAA CGA CAG CCG TGC ATG CCC ACT CAC CCG CAG AGG CCC CTG GTC TTG 1009 
Glu Arg Gln Pro Cys Met Pro Thr His Pro Gln Arg Pro Leu Val Leu
320 325 330 335 

AAG ACT GGG GTA CAG TTC ACT GTC AAG TCG AGA CTG TTG GTG AAA TTG 1057 
Lys Thr Gly Val Gln Phe Thr Val Lys Ser Arg Leu Leu Val Lys Leu
340 345 350 

CAA GAG TCG AAT CTA TTA ACG AAA GTG AAA TGT CAC TTT GAC AAA GAT 1105 
Gln Glu Ser Asn Leu Leu Thr Lys Val Lys Cys His Phe Asp Lys Asp
355 360 365 

GTG AAC GAG AAA AAC ACA GTT AAA GGA TTT CGG AAG TTC AAC ATC TTG 1153 
Val Asn Glu Lys Asn Thr Val Lys Gly Phe Arg Lys Phe Asn Ile Leu
370 375 380 

GGT ACG CAC ACA AAA GTG ATG AAC ATG GAA GAA TCC ACC AAC GGA AGT 1201 
Gly Thr His Thr Lys Val Met Asn Met Glu Glu Ser Thr Asn Gly Ser 
385 390 395 

CTG GCA GCT GAG CTC CGA CAC CTG CAA CTG AAG GAA CAG AAA AAC GCT 1249
Leu Ala Ala Glu Leu Arg His Leu Gln Leu Lys Glu Gln Lys Asn Ala
400 405 410 415

GGG AAC AGA ACT AAT GAG GGG CCT CTC ATT GTC ACC GAA GAA CTT CAC 1297
Gly Asn Arg Thr Asn Glu Gly Pro Leu Ile Val Thr Glu Glu Leu His
420 425 430 

TCT CTT AGC TTT GAA ACC CAG TTG TGC CAG CCA GGC TTG GTG ATT GAC 1345 
Ser Leu Ser Phe Glu Thr Gln Leu Cys Gln Pro Gly Leu Val Ile Asp
435 440 445

CTG GAG ACC ACC TCT CTT CCT GTC GTG GTG ATC TCC AAC GTC AGC CAG 1393
Leu Glu Thr Thr Ser Leu Pro Val Val Val Ile Ser Asn Val Ser Gln
450 455 460 



CTC CCC AGT GGC TGG GCG TCT ATC CTG TGG TAC AAC ATG CTG GTG ACA 1441
Leu Pro Ser Gly Trp Ala Ser Ile Leu Trp Tyr Asn Met Leu Val Thr
465 470 475

GAG CCC AGG AAT CTC TCC TTC TTC CTG AAC CCC CCG TGC GCG TGG TGG 1489
Glu Pro Arg Asn Leu Ser Phe Phe Leu Asn Pro Pro Cys Ala Trp Trp
480 485 490 495

TCC CAG CTC TCA GAG GTG TTG AGT TGG CAG TTT TCA TCA GTC ACC AAG 1537
Ser Gln Leu Ser Glu Val Leu Ser Trp Gln Phe Ser Ser Val Thr Lys
500 505 510

AGA GGT CTG AAC GCA GAC CAG CTG AGC ATG CTG GGA GAG AAG CTG CTG 1585
Arg Gly Leu Asn Ala Asp Gln Leu Ser Met Leu Gly Glu Lys Leu Leu
515 520 525 

GGC CCT AAT GCT GGC CCT GAT GGT CTT ATT CCA TGG ACA AGG TTT TGT 1633
Gly Pro Asn Ala Gly Pro Asp Gly Leu Ile Pro Trp Thr Arg Phe Cys
530 535 540

AAG GAA AAT ATT AAT GAT AAA AAT TTC TCC TTC TGG CCT TGG ATT GAC 1681 
Lys Glu Asn Ile Asn Asp Lys Asn Phe Ser Phe Trp Pro Trp Ile Asp
545 550 555 

ACC ATC CTA GAG CTC ATT AAG AAC GAC CTG CTG TGC CTC TGG AAT GAT 1729
Thr Ile Leu Glu Leu Ile Lys Asn Asp Leu Leu Cys Leu Trp Asn Asp
560 565 570 575 

GGG TGC ATT ATG GGC TTC ATC AGC AAG GAG CGA GAA CGC GCT CTG CTC 1777 
Gly Cys Ile Met Gly Phe Ile Ser Lys Glu Arg Glu Arg Ala Leu Leu
580 585 590 

AAG GAC CAG CAG CCA GGG ACG TTC CTG CTT AGA TTC AGT GAG AGC TCC 1825 
Lys Asp Gln Gln Pro Gly Thr Phe Leu Leu Arg Phe Ser Glu Ser Ser
595 600 605 

CGG GAA GGG GCC ATC ACA TTC ACA TGG GTG GAA CGG TCC CAG AAC GGA 1873 
Arg Glu Gly Ala Ile Thr Phe Thr Trp Val Glu Arg Ser Gln Asn Gly
610 615 620 

GGT GAA CCT GAC TTC CAT GCC GTG GAG CCC TAC ACG AAA AAA GAA CTT 1921
Gly Glu Pro Asp Phe His Ala Val Glu Pro Tyr Thr Lys Lys Glu Leu 
625 630 635 

TCA GCT GTT ACT TTC CCA GAT ATT ATT CGC AAC TAC AAA GTC ATG GCT 1969
Ser Ala Val Thr Phe Pro Asp Ile Ile Arg Asn Tyr Lys Val Met Ala
640 645 650 655 

GCC GAG AAC ATA CCA GAG AAT CCC CTG AAG TAT CTG TAC CCC AAT ATT 2017
Ala Glu Asn Ile Pro Glu Asn Pro Leu Lys Tyr Leu Tyr Pro Asn Ile
660 665 670 

GAC AAA GAC CAC GCC TTT GGG AAG TAT TAT TCC AGA CCA AAG GAA GCA 2065
Asp Lys Asp His Ala Phe Gly Lys Tyr Tyr Ser Arg Pro Lys Glu Ala
675 680 685

CCA GAA CCG ATG GAG CTT GAC GAC CCT AAG CGA ACT GGA TAC ATC AAG 2113 
Pro Glu Pro Met Glu Leu Asp Asp Pro Lys Arg Thr Gly Tyr Ile Lys
690 695 700 

ACT GAG TTG ATT TCT GTG TCT GAA GTC CAC CCT TCT AGA CTT CAG ACC 2161 
Thr Glu Leu Ile Ser Val Ser Glu Val His Pro Ser Arg Leu Gln Thr
705 710 715 

ACA GAC AAC CTG CTT CCC ATG TCT CCA GAG GAG TTT GAT GAG ATG TCC 2209 
Thr Asp Asn Leu Leu Pro Met Ser Pro Glu Glu Phe Asp Glu Met Ser 
720 725 730 735 



CGG ATA GTG GGC CCC GAA TTT GAG AGT ATG ATG AGC ACA GTA 2251
Arg Ile Val Gly Pro Glu Phe Asp Ser Met Met Ser Thr Val
740 745

TAAACACGAA TTTCTCTCTG GCGACA 2277



(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 749 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

Met Ser Gln Trp Phe Glu Leu Gln Gln Leu Asp Ser Lys Phe Leu Glu
1 5 10 15

Gln Val His Gln Leu Tyr Asp Asp Ser Phe Pro Met Glu Ile Arg Gln
20 25 30

Tyr Leu Ala Gln Trp Leu Glu Lys Gln Asp Trp Glu His Ala Ala Tyr
35 40 45

Asp Val Ser Phe Ala Thr Ile Arg Phe His Asp Leu Leu Ser Gln Leu
50 55 60

Asp Asp Gln Tyr Ser Arg Phe Ser Leu Glu Asn Asn Phe Leu Leu Gln
65 70 75 80

His Asn Ile Arg Lys Ser Lys Arg Asn Leu Gln Asp Asn Phe Gln Glu
85 90 95

Asp Pro Val Gln Met Ser Met Ile Ile Tyr Asn Cys Leu Lys Glu Glu
100 105 110

Arg Lys Ile Leu Glu Asn Ala Gln Arg Phe Asn Gln Ala Gln Glu Gly
115 120 125

Asn Ile Gln Asn Thr Val Met Leu Asp Lys Gln Lys Glu Leu Asp Ser
130 135 140

Lys Val Arg Asn Val Lys Asp Gln Val Met Cys Ile Glu Gln Glu Ile
145 150 155 160

Lys Thr Leu Glu Glu Leu Gln Asp Glu Tyr Asp Phe Lys Cys Lys Thr
165 170 175

Ser Gln Asn Arg Glu Gly Glu Ala Asn Gly Val Ala Lys Ser Asp Gln
180 185 190

Lys Gln Glu Gln Leu Leu Leu His Lys Met Phe Leu Met Leu Asp Asn
195 200 205

Lys Arg Lys Glu Ile Ile His Lys Ile Arg Glu Leu Leu Asn Ser Ile
210 215 220

Glu Leu Thr Gln Asn Thr Leu Ile Asn Asp Glu Leu Val Glu Trp Lys
225 230 235 240

Arg Arg Gln Gln Ser Ala Cys Ile Gly Gly Pro Pro Asn Ala Cys Leu
245 250 255

Asp Gln Leu Gln Thr Trp Phe Thr Ile Val Ala Glu Thr Leu Gln Gln
260 265 270



Ile Arg Gln Gln Leu Lys Lys Leu Glu Glu Leu Glu Gln Lys Phe Thr
275 280 285

Tyr Glu Pro Asp Pro Ile Thr Lys Asn Lys Gln Val Leu Ser Asp Arg
290 295 300

Thr Phe Leu Leu Phe Gln Gln Leu Ile Gln Ser Ser Phe Val Val Glu
305 310 315 320

Arg Gln Pro Cys Met Pro Thr His Pro Gln Arg Pro Leu Val Leu Lys
325 330 335

Thr Gly Val Gln Phe Thr Val Lys Ser Arg Leu Leu Val Lys Leu Gln
340 345 350

Glu Ser Asn Leu Leu Thr Lys Val Lys Cys His Phe Asp Lys Asp Val
355 360 365

Asn Glu Lys Asn Thr Val Lys Gly Phe Arg Lys Phe Asn Ile Leu Gly
370 375 380

Thr His Thr Lys Val Met Asn Met Glu Glu Ser Thr Asn Gly Ser Leu
385 390 395 400

Ala Ala Glu Leu Arg His Leu Gln Leu Lys Glu Gln Lys Asn Ala Gly
405 410 415

Asn Arg Thr Asn Glu Gly Pro Leu Ile Val Thr Glu Glu Leu His Ser
420 425 430

Leu Ser Phe Glu Thr Gln Leu Cys Gln Pro Gly Leu Val Ile Asp Leu
435 440 445

Glu Thr Thr Ser Leu Pro Val Val Val Ile Ser Asn Val Ser Gln Leu
450 455 460

Pro Ser Gly Trp Ala Ser Ile Leu Trp Tyr Asn Met Leu Val Thr Glu
465 470 475 480

Pro Arg Asn Leu Ser Phe Phe Leu Asn Pro Pro Cys Ala Trp Trp Ser
485 490 495

Gln Leu Ser Glu Val Leu Ser Trp Gln Phe Ser Ser Val Thr Lys Arg
500 505 510

Gly Leu Asn Ala Asp Gln Leu Ser Met Leu Gly Glu Lys Leu Leu Gly
515 520 525

Pro Asn Ala Gly Pro Asp Gly Leu Ile Pro Trp Thr Arg Phe Cys Lys
530 535 540

Glu Asn Ile Asn Asp Lys Asn Phe Ser Phe Trp Pro Trp Ile Asp Thr
545 550 555 560

Ile Leu Glu Leu Ile Lys Asn Asp Leu Leu Cys Leu Trp Asn Asp Gly
565 570 575

Cys Ile Met Gly Phe Ile Ser Lys Glu Arg Glu Arg Ala Leu Leu Lys
580 585 590

Asp Gln Gln Pro Gly Thr Phe Leu Leu Arg Phe Ser Glu Ser Ser Arg
595 600 605

Glu Gly Ala Ile Thr Phe Thr Trp Val Glu Arg Ser Gln Asn Gly Gly
610 615 620

Glu Pro Asp Phe His Ala Val Glu Pro Tyr Thr Lys Lys Glu Leu Ser
625 630 635 640



Ala Val Thr Phe Pro Asp Ile Ile Arg Asn Tyr Lys Val Met Ala Ala
645 650 655

Glu Asn Ile Pro Glu Asn Pro Leu Lys Tyr Leu Tyr Pro Asn Ile Asp
660 665 670

Lys Asp His Ala Phe Gly Lys Tyr Tyr Ser Arg Pro Lys Glu Ala Pro
675 680 685

Glu Pro Met Glu Leu Asp Asp Pro Lys Arg Thr Gly Tyr Ile Lys Thr
690 695 700

Glu Leu Ile Ser Val Ser Glu Val His Pro Ser Arg Leu Gln Thr Thr
705 710 715 720

Asp Asn Leu Leu Pro Met Ser Pro Glu Glu Phe Asp Glu Met Ser Arg
725 730 735

Ile Val Gly Pro Glu Phe Asp Ser Met Met Ser Thr Val
740 745

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2375 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: both

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mouse 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: splenic/thymic 

(B) CLONE: Murine 13sfl 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 34..2277

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

TGCCACTACC TGGACGGAGA GAGAGAGAGC AGC ATG TCT CAG TGG AAT CAA GTC 54 

Met Ser Gln Trp Asn Gln Val
1 5 

CAA CAA TTA GAA ATC AAG TTT TTG GAG CAA GTA GAT CAG TTC TAT GAT 102 
Gln Gln Leu Glu Ile Lys Phe Leu Glu Gln Val Asp Gln Phe Tyr Asp
10 15 20 

GAC AAC TTT CCT ATG GAA ATC CGG CAT CTG CTA GCT CAG TGG ATT GAG 150 
Asp Asn Phe Pro Met Glu Ile Arg His Leu Leu Ala Gln Trp Ile Glu
25 30 35 

ACT CAA GAC TGG GAA GTA GCT TCT AAC AAT GAA ACT ATG GCA ACA ATT 198
Thr Gln Asp Trp Glu Val Ala Ser Asn Asn Glu Thr Met Ala Thr Ile
40 45 50 55 



CTG CTT CAA AAC TTA CTA ATA CAA TTG GAT GAA CAG TTG GGG CGG GTT 246
Leu Leu Gln Asn Leu Leu Ile Gln Leu Asp Glu Gln Leu Gly Arg Val
60 65 70



TCC AAA GAA AAA AAT CTG CTA TTG ATT CAC AAT CTA AAG AGA ATT AGA 294 
Ser Lys Glu Lys Asn Leu Leu Leu Ile His Asn Leu Lys Arg Ile Arg
75 80 85 

AAA GTT CTT CAG GGC AAG TTT CAT GGA AAT CCA ATG CAT GTA GCT GTG 342 
Lys Val Leu Gln Gly Lys Phe His Gly Asn Pro Met His Val Ala Val
90 95 100

GTA ATT TCA AAT TGC TTA AGG GAA GAG AGG AGA ATA TTG GCT GCA GCC 390
Val Ile Ser Asn Cys Leu Arg Glu Glu Arg Arg Ile Leu Ala Ala Ala
105 110 115 

AAC ATG CCT ATC CAG GGA CCT CTG GAG AAA TCC TTA CAG AGT TCT TCA 438
Asn Met Pro Ile Gln Gly Pro Leu Glu Lys Ser Leu Gln Ser Ser Ser
120 125 130 135 

GTT TCT GAA AGA CAA AGG AAT GTG GAA CAC AAA GTG TCT GCC ATT AAA 486
Val Ser Glu Arg Gln Arg Asn Val Glu His Lys Val Ser Ala Ile Lys
140 145 150

AAC AGT GTG CAG ATG ACA GAA CAA GAT ACC AAA TAC TTA GAA GAC CTG 534 
Asn Ser Val Gln Met Thr Glu Gln Asp Thr Lys Tyr Leu Glu Asp Leu
155 160 165 

CAA GAT GAG TTT GAC TAC AGG TAT AAA ACA ATT CAG ACA ATG GAT CAG 582 
Gln Asp Glu Phe Asp Tyr Arg Tyr Lys Thr Ile Gln Thr Met Asp Gln
170 175 180 

GGT GAC AAA AAC AGT ATC CTG GTG AAC CAG GAA GTT TTG ACA CTG CTG 630 
Gly Asp Lys Asn Ser Ile Leu Val Asn Gln Glu Val Leu Thr Leu Leu
185 190 195 

CAA GAA ATG CTT AAT AGT CTG GAC TTC AAG AGA AAG GAA GCA CTC AGT 678 
Gln Glu Met Leu Asn Ser Leu Asp Phe Lys Arg Lys Glu Ala Leu Ser
200 205 210 215 

AAG ATG ACG CAG ATA GTG AAC GAG ACA GAC CTG CTC ATG AAC AGC ATG 726 
Lys Met Thr Gln Ile Val Asn Glu Thr Asp Leu Leu Met Asn Ser Met
220 225 230 

CTT CTA GAA GAG CTG CAG GAC TGG AAA AAG CGG CAC AGG ATT GCC TGC 774 
Leu Leu Glu Glu Leu Gln Asp Trp Lys Lys Arg His Arg Ile Ala Cys
235 240 245

ATT GGT GGC CCG CTC CAC AAT GGG CTG GAC CAG CTT CAG AAC TGC TTT 822 
Ile Gly Gly Pro Leu His Asn Gly Leu Asp Gln Leu Gln Asn Cys Phe
250 255 260 

ACC CTA CTG GCA GAG AGT CTT TTC CAA CTC AGA CAG CAA CTG GAG AAA 870 
Thr Leu Leu Ala Glu Ser Leu Phe Gln Leu Arg Gln Gln Leu Glu Lys
265 270 275 

CTA CAG GAG CAA TCT ACT AAA ATG ACC TAT GAA GGG GAT CCC ATC CCT 918 
Leu Gln Glu Gln Ser Thr Lys Met Thr Tyr Glu Gly Asp Pro Ile Pro
280 285 290 295

GCT CAA AGA GCA CAC CTC CTG GAA AGA GCT ACC TTC CTG ATC TAC AAC 966 
Ala Gln Arg Ala His Leu Leu Glu Arg Ala Thr Phe Leu Ile Tyr Asn
300 305 310 

CTT TTC AAG AAC TCA TTT GTG GTC GAG CGA CAC GCA TGC ATG CCA ACG 1014 
Leu Phe Lys Asn Ser Phe Val Val Glu Arg His Ala Cys Met Pro Thr 
315 320 325 

CAC CCT CAG AGG CCG ATG GTA CTT AAA ACC CTC ATT CAG TTC ACT GTA 1062 
His Pro Gln Arg Pro Met Val Leu Lys Thr Leu Ile Gln Phe Thr Val
330 335 340 



AAA CTG AGA TTA CTA ATA AAA TTG CCG GAA CTA AAC TAT CAG GTG AAA 1110
Lys Leu Arg Leu Leu Ile Lys Leu Pro Glu Leu Asn Tyr Gln Val Lys
345 350 355 

GTA AAG GCG TCC ATT GAG AAG AAT GTT TCA ACT CTA AGC AAT AGA AGA 1158 
Val Lys Ala Ser Ile Asp Lys Asn Val Ser Thr Leu Ser Asn Arg Arg
360 365 370 375 

TTT GTG CTT TGT GGA ACT CAC GTC AAA GCT ATG TCC AGT GAG GAA TCT 1206 
Phe Val Leu Cys Gly Thr His Val Lys Ala Met Ser Ser Glu Glu Ser 
380 385 390 

TCC AAT GGG AGC CTC TCA GTG GAG TTA GAC ATT GCA ACC CAA GGA GAT 1254 
Ser Asn Gly Ser Leu Ser Val Glu Leu Asp Ile Ala Thr Gln Gly Asp
395 400 405 

GAA GTG CAG TAC TGG AGT AAA GGA AAC GAG GGC TGC CAC ATG GTG ACA 1302
Glu Val Gln Tyr Trp Ser Lys Gly Asn Glu Gly Cys His Met Val Thr
410 415 420

GAG GAG TTG CAT TCC ATA ACC TTT GAG ACC CAG ATC TGC CTC TAT GGC 1350 
Glu Glu Leu His Ser Ile Thr Phe Glu Thr Gln Ile Cys Leu Tyr Gly
425 430 435 

CTC ACC ATT AAC CTA GAG ACC AGC TCA TTA CCT GTC GTG ATG ATT TCT 1398 
Leu Thr Ile Asn Leu Glu Thr Ser Ser Leu Pro Val Val Met Ile Ser
440 445 450 455 

AAT GTC AGC CAA CTA CCT AAT GCA TGG GCA TCC ATC ATT TGG TAC AAT 1446
Asn Val Ser Gln Leu Pro Asn Ala Trp Ala Ser Ile Ile Trp Tyr Asn
460 465 470 

GTA TCA ACT AAC GAC TCC CAG AAC TTG GTT TTC TTT AAT AAC CCT CCA 1494
Val Ser Thr Asn Asp Ser Gln Asn Leu Val Phe Phe Asn Asn Pro Pro
475 480 485 

TCT GTC ACT TTG GGC CAA CTC CTG GAA GTG ATG AGC TGG CAA TTT TCA 1542 
Ser Val Thr Leu Gly Gln Leu Leu Glu Val Met Ser Trp Gln Phe Ser
490 495 500 

TCC TAT GTC GGT CGT GGC CTT AAT TCA GAG CAG CTC AAC ATG CTG GCA 1590
Ser Tyr Val Gly Arg Gly Leu Asn Ser Glu Gln Leu Asn Met Leu Ala
505 510 515 

GAG AAG CTC ACA GTT CAG TCT AAC TAC AAT GAT GGT CAC CTC ACC TGG 1638
Glu Lys Leu Thr Val Gln Ser Asn Tyr Asn Asp Gly His Leu Thr Trp
520 525 530 535 

GCC AAG TTC TGC AAG GAA CAT TTG CCT GGC AAA ACA TTT ACC TTC TGG 1686
Ala Lys Phe Cys Lys Glu His Leu Pro Gly Lys Thr Phe Thr Phe Trp 
540 545 550 

ACT TGG CTT GAA GCA ATA TTG GAC CTA ATT AAA AAA CAT ATT CTT CCC 1734 
Thr Trp Leu Glu Ala Ile Leu Asp Leu Ile Lys Lys His Ile Leu Pro
555 560 565

CTC TGG ATT GAT GGG TAC ATC ATG GGA TTT GTT AGT AAA GAG AAG GAA 1782 
Leu Trp Ile Asp Gly Tyr Ile Met Gly Phe Val Ser Lys Glu Lys Glu
570 575 580 

CGG CTT CTG CTC AAA GAT AAA ATG CCT GGG ACA TTT TTG TTA AGA TTC 1830
Arg Leu Leu Leu Lys Asp Lys Met Pro Gly Thr Phe Leu Leu Arg Phe 
585 590 595 

AGT GAG AGC CAT CTT GGA GGG ATA ACC TTC ACC TGG GTG GAC CAA TCT 1878 
Ser Glu Ser His Leu Gly Gly Ile Thr Phe Thr Trp Val Asp Gln Ser
600 605 610 615 



GAA AAT GGA GAA GTG AGA TTC CAC TCT GTA GAA CCC TAC AAC AAA GGG 1926
Glu Asn Gly Glu Val Arg Phe His Ser Val Glu Pro Tyr Asn Lys Gly 
620 625 630 

AGA CTG TCG GCT CTG GCC TTC GCT GAC ATC CTG CGA GAC TAC AAG GTT 1974
Arg Leu Ser Ala Leu Ala Phe Ala Asp Ile Leu Arg Asp Tyr Lys Val
635 640 645 

ATC ATG GCT GAA AAC ATC CCT GAA AAC CCT CTG AAG TAC CTC TAC CCT 2022
Ile Met Ala Glu Asn Ile Pro Glu Asn Pro Leu Lys Tyr Leu Tyr Pro
650 655 660 

GAC ATT CCC AAA GAC AAA GCC TTT GGC AAA CAC TAC AGC TCC CAG CCG 2070
Asp Ile Pro Lys Asp Lys Ala Phe Gly Lys His Tyr Ser Ser Gln Pro
665 670 675 

TGC GAA GTC TCA AGA CCA ACC GAA CGG GGA GAC AAG GGT TAC GTC CCC 2118 
Cys Glu Val Ser Arg Pro Thr Glu Arg Gly Asp Lys Gly Tyr Val Pro 
680 685 690 695

TCT GTT TTT ATC CCC ATT TCA ACA ATC CGA AGC GAT TCC ACG GAG CCA 2166
Ser Val Phe Ile Pro Ile Ser Thr Ile Arg Ser Asp Ser Thr Glu Pro
700 705 710 

CAA TCT CCT TCA GAC CTT CTC CCC ATG TCT CCA AGT GCA TAT GCT GTG 2214 
Gln Ser Pro Ser Asp Leu Leu Pro Met Ser Pro Ser Ala Tyr Ala Val
715 720 725 

CTG AGA GAA AAC CTG AGC CCA ACG ACA ATT GAA ACT GCA ATG AAT TCC 2262
Leu Arg Glu Asn Leu Ser Pro Thr Thr Ile Glu Thr Ala Met Asn Ser
730 735 740 

CCA TAT TCT GCT GAA TGACGGTGCA AACGGACACT TTAAAGAAGG AAGCAGATGA 2317 
Pro Tyr Ser Ala Glu 
745 

AACTGGAGAG TGTTCTTTAC CATAGATCAC AATTTATTTC TTCGGCTTTG TAAATACC 2375 

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 748 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

Met Ser Gln Trp Asn Gln Val Gln Gln Leu Glu Ile Lys Phe Leu Glu
1 5 10 15

Gln Val Asp Gln Phe Tyr Asp Asp Asn Phe Pro Met Glu Ile Arg His
20 25 30 

Leu Leu Ala Gln Trp Ile Glu Thr Gln Asp Trp Glu Val Ala Ser Asn
35 40 45 

Asn Glu Thr Met Ala Thr Ile Leu Leu Gln Asn Leu Leu Ile Gln Leu
50 55 60 

Asp Glu Gln Leu Gly Arg Val Ser Lys Glu Lys Asn Leu Leu Leu Ile
65 70 75 80 



His Asn Leu Lys Arg Ile Arg Lys Val Leu Gln Gly Lys Phe His Gly
85 90 95 



Asn Pro Met His Val Ala Val Val Ile Ser Asn Cys Leu Arg Glu Glu
100 105 110 

Arg Arg Ile Leu Ala Ala Ala Asn Met Pro Ile Gln Gly Pro Leu Glu
115 120 125 

Lys Ser Leu Gln Ser Ser Ser Val Ser Glu Arg Gln Arg Asn Val Glu
130 135 140 

His Lys Val Ser Ala Ile Lys Asn Ser Val Gln Met Thr Glu Gln Asp
145 150 155 160

Thr Lys Tyr Leu Glu Asp Leu Gln Asp Glu Phe Asp Tyr Arg Tyr Lys
165 170 175 

Thr Ile Gln Thr Met Asp Gln Gly Asp Lys Asn Ser Ile Leu Val Asn
180 185 190

Gln Glu Val Leu Thr Leu Leu Gln Glu Met Leu Asn Ser Leu Asp Phe
195 200 205 

Lys Arg Lys Glu Ala Leu Ser Lys Met Thr Gln Ile Val Asn Glu Thr
210 215 220 

Asp Leu Leu Met Asn Ser Met Leu Leu Glu Glu Leu Gln Asp Trp Lys
225 230 235 240 

Lys Arg His Arg Ile Ala Cys Ile Gly Gly Pro Leu His Asn Gly Leu
245 250 255 

Asp Gln Leu Gln Asn Cys Phe Thr Leu Leu Ala Glu Ser Leu Phe Gln
260 265 270 

Leu Arg Gln Gln Leu Glu Lys Leu Gln Glu Gln Ser Thr Lys Met Thr
275 280 285 

Tyr Glu Gly Asp Pro Ile Pro Ala Gln Arg Ala His Leu Leu Glu Arg
290 295 300 

Ala Thr Phe Leu Ile Tyr Asn Leu Phe Lys Asn Ser Phe Val Val Glu
305 310 315 320 

Arg His Ala Cys Met Pro Thr His Pro Gln Arg Pro Met Val Leu Lys
325 330 335 

Thr Leu Ile Gln Phe Thr Val Lys Leu Arg Leu Leu Ile Lys Leu Pro
340 345 350 

Glu Leu Asn Tyr Gln Val Lys Val Lys Ala Ser Ile Asp Lys Asn Val
355 360 365 

Ser Thr Leu Ser Asn Arg Arg Phe Val Leu Cys Gly Thr His Val Lys 
370 375 380 

Ala Met Ser Ser Glu Glu Ser Ser Asn Gly Ser Leu Ser Val Glu Leu
385 390 395 400 

Asp Ile Ala Thr Gln Gly Asp Glu Val Gln Tyr Trp Ser Lys Gly Asn
405 410 415 

Glu Gly Cys His Met Val Thr Glu Glu Leu His Ser Ile Thr Phe Glu
420 425 430 

Thr Gln Ile Cys Leu Tyr Gly Leu Thr Ile Asn Leu Glu Thr Ser Ser
435 440 445 

Leu Pro Val Val Met Ile Ser Asn Val Ser Gln Leu Pro Asn Ala Trp
450 455 460 



Ala Ser Ile Ile Trp Tyr Asn Val Ser Thr Asn Asp Ser Gln Asn Leu
465 470 475 480 

Val Phe Phe Asn Asn Pro Pro Ser Val Thr Leu Gly Gln Leu Leu Glu
485 490 495 

Val Met Ser Trp Gln Phe Ser Ser Tyr Val Gly Arg Gly Leu Asn Ser
500 505 510 

Glu Gln Leu Asn Met Leu Ala Glu Lys Leu Thr Val Gln Ser Asn Tyr
515 520 525 

Asn Asp Gly His Leu Thr Trp Ala Lys Phe Cys Lys Glu His Leu Pro 
530 535 540 

Gly Lys Thr Phe Thr Phe Trp Thr Trp Leu Glu Ala Ile Leu Asp Leu
545 550 555 560 

Ile Lys Lys His Ile Leu Pro Leu Trp Ile Asp Gly Tyr Ile Met Gly
565 570 575 

Phe Val Ser Lys Glu Lys Glu Arg Leu Leu Leu Lys Asp Lys Met Pro 
580 585 590 

Gly Thr Phe Leu Leu Arg Phe Ser Glu Ser His Leu Gly Gly Ile Thr
595 600 605 

Phe Thr Trp Val Asp Gln Ser Glu Asn Gly Glu Val Arg Phe His Ser
610 615 620 

Val Glu Pro Tyr Asn Lys Gly Arg Leu Ser Ala Leu Ala Phe Ala Asp 
625 630 635 640 

Ile Leu Arg Asp Tyr Lys Val Ile Met Ala Glu Asn Ile Pro Glu Asn
645 650 655 

Pro Leu Lys Tyr Leu Tyr Pro Asp Ile Pro Lys Asp Lys Ala Phe Gly
660 665 670 

Lys His Tyr Ser Ser Gln Pro Cys Glu Val Ser Arg Pro Thr Glu Arg
675 680 685 

Gly Asp Lys Gly Tyr Val Pro Ser Val Phe Ile Pro Ile Ser Thr Ile
690 695 700 

Arg Ser Asp Ser Thr Glu Pro Gln Ser Pro Ser Asp Leu Leu Pro Met
705 710 715 720 

Ser Pro Ser Ala Tyr Ala Val Leu Arg Glu Asn Leu Ser Pro Thr Thr 
725 730 735 

Ile Glu Thr Ala Met Asn Ser Pro Tyr Ser Ala Glu
740 745 

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2869 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: both

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE: 



(A) ORGANISM: Mouse 



(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: splenic/thymic 

(B) CLONE: Murine, 19sf6 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 69..2378



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

GCCGCGACCA GCCAGGCCGG CCAGTCGGGC TCAGCCCGGA GACAGTCGAG ACCCCTGACT 60

GCAGCAGG ATG GCT CAG TGG AAC CAG CTG CAG CAG CTG GAC ACA CGC TAC 110
Met Ala Gln Trp Asn Gln Leu Gln Gln Leu Asp Thr Arg Tyr
1 5 10

CTG AAG CAG CTG CAC CAG CTG TAC AGC GAC ACG TTC CCC ATG GAG CTG 158 
Leu Lys Gln Leu His Gln Leu Tyr Ser Asp Thr Phe Pro Met Glu Leu
15 20 25 30 

CGG CAG TTC CTG GCA CCT TGG ATT GAG AGT CAA GAC TGG GCA TAT GCA 206 
Arg Gln Phe Leu Ala Pro Trp Ile Glu Ser Gln Asp Trp Ala Tyr Ala
35 40 45 

GCC AGC AAA GAG TCA CAT GCC ACG TTG GTG TTT CAT AAT CTC TTG GGT 254 
Ala Ser Lys Glu Ser His Ala Thr Leu Val Phe His Asn Leu Leu Gly 
50 55 60 

GAA ATT GAC CAG CAA TAT AGC CGA TTC CTG CAA GAG TCC AAT GTC CTC 302 
Glu Ile Asp Gln Gln Tyr Ser Arg Phe Leu Gln Glu Ser Asn Val Leu
65 70 75 

TAT CAG CAC AAC CTT CGA AGA ATC AAG CAG TTT CTG CAG AGC AGG TAT 350 
Tyr Gln His Asn Leu Arg Arg Ile Lys Gln Phe Leu Gln Ser Arg Tyr
80 85 90 

CTT GAG AAG CCA ATG GAA ATT GCC CGG ATC GTG GCC CGA TGC CTG TGG 398
Leu Glu Lys Pro Met Glu Ile Ala Arg Ile Val Ala Arg Cys Leu Trp
95 100 105 110 

GAA GAG TCT CGC CTC CTC CAG ACG GCA GCC ACG GCA GCC CAG CAA GGG 446
Glu Glu Ser Arg Leu Leu Gln Thr Ala Ala Thr Ala Ala Gln Gln Gly
115 120 125 

GGC CAG GCC AAC CAC CCA ACA GCC GCC GTA GTG ACA GAG AAG CAG CAG 494
Gly Gln Ala Asn His Pro Thr Ala Ala Val Val Thr Glu Lys Gln Gln
130 135 140 

ATG TTG GAG CAG CAT CTT CAG GAT GTC CGG AAG CGA GTG CAG GAT CTA 542 
Met Leu Glu Gln His Leu Gln Asp Val Arg Lys Arg Val Gln Asp Leu
145 150 155

GAA CAG AAA ATG AAG GTG GTG GAG AAC CTC CAG GAC GAC TTT GAT TTC 590 
Glu Gln Lys Met Lys Val Val Glu Asn Leu Gln Asp Asp Phe Asp Phe
160 165 170 

AAC TAC AAA ACC CTC AAG AGC CAA GGA GAC ATG CAG GAT CTG AAT GGA 638 
Asn Tyr Lys Thr Leu Lys Ser Gln Gly Asp Met Gln Asp Leu Asn Gly
175 180 185 190 

AAC AAC CAG TCT GTG ACC AGA CAG AAG ATG CAG CAG CTG GAA CAG ATG 686
Asn Asn Gln Ser Val Thr Arg Gln Lys Met Gln Gln Leu Glu Gln Met
195 200 205 



CTC ACA GCC CTG GAC CAG ATG CGG AGA AGC ATT GTG AGT GAG CTG GCG 734 
Leu Thr Ala Leu Asp Gln Met Arg Arg Ser Ile Val Ser Glu Leu Ala
210 215 220 

GGG CTC TTG TCA GCA ATG GAG TAC GTG CAG AAG ACA CTG ACT GAT GAA 782
Gly Leu Leu Ser Ala Met Glu Tyr Val Gln Lys Thr Leu Thr Asp Glu
225 230 235 

GAG CTG GCT GAC TGG AAG AGG CGG CCA GAG ATC GCG TGC ATC GGA GGC 830
Glu Leu Ala Asp Trp Lys Arg Arg Pro Glu Ile Ala Cys Ile Gly Gly
240 245 250 

CCT CCC AAC ATC TGC CTG GAC CGT CTG GAA AAC TGG ATA ACT TCA TTA 878 
Pro Pro Asn Ile Cys Leu Asp Arg Leu Glu Asn Trp Ile Thr Ser Leu
255 260 265 270 

GCA GAA TCT CAA CTT CAG ACC CGC CAA CAA ATT AAG AAA CTG GAG GAG 926 
Ala Glu Ser Gln Leu Gln Thr Arg Gln Gln Ile Lys Lys Leu Glu Glu
275 280 285

CTG CAG CAG AAA GTG TCC TAC AAG GGC GAC CCT ATC GTG CAG CAC CGG 974
Leu Gln Gln Lys Val Ser Tyr Lys Gly Asp Pro Ile Val Gln His Arg
290 295 300 

CCC ATG CTG GAG GAG AGG ATC GTG GAG CTG TTC AGA AAC TTA ATG AAG 1022
Pro Met Leu Glu Glu Arg Ile Val Glu Leu Phe Arg Asn Leu Met Lys
305 310 315 

AGT GCC TTC GTG GTG GAG CGG CAG CCC TGC ATG CCC ATG CAC CCG GAC 1070 
Ser Ala Phe Val Val Glu Arg Gln Pro Cys Met Pro Met His Pro Asp
320 325 330 

CGG CCC TTA GTC ATC AAG ACT GGT GTC CAG TTT ACC ACG AAA GTC AGG 1118
Arg Pro Leu Val Ile Lys Thr Gly Val Gln Phe Thr Thr Lys Val Arg
335 340 345 350 

TTG CTG GTC AAA TTT CCT GAG TTG AAT TAT CAG CTT AAA ATT AAA GTG 1166 
Leu Leu Val Lys Phe Pro Glu Leu Asn Tyr Gln Leu Lys Ile Lys Val
355 360 365 

TGC ATT GAT AAA GAC TCT GGG GAT GTT GCT GCC CTC AGA GGG TCT CGG 1214 
Cys Ile Asp Lys Asp Ser Gly Asp Val Ala Ala Leu Arg Gly Ser Arg
370 375 380 

AAA TTT AAC ATT CTG GGC ACG AAC ACA AAA GTG ATG AAC ATG GAG GAG 1262 
Lys Phe Asn Ile Leu Gly Thr Asn Thr Lys Val Met Asn Met Glu Glu
385 390 395 

TCT AAC AAC GGC AGC CTG TCT GCA GAG TTC AAG CAC CTG ACC CTT AGG 1310 
Ser Asn Asn Gly Ser Leu Ser Ala Glu Phe Lys His Leu Thr Leu Arg 
400 405 410 

GAG CAG AGA TGT GGG AAT GGA GGC CGT GCC AAT TGT GAT GCC TCC TTG 1358 
Glu Gln Arg Cys Gly Asn Gly Gly Arg Ala Asn Cys Asp Ala Ser Leu
415 420 425 430

ATC GTG ACT GAG GAG CTG CAC CTG ATC ACC TTC GAG ACT GAG GTG TAC 1406
Ile Val Thr Glu Glu Leu His Leu Ile Thr Phe Glu Thr Glu Val Tyr
435 440 445 

CAC CAA GGC CTC AAG ATT GAC CTA GAG ACC CAC TCC TTG CCA GTT GTG 1454 
His Gln Gly Leu Lys Ile Asp Leu Glu Thr His Ser Leu Pro Val Val
450 455 460 

GTG ATC TCC AAC ATC TGT CAG ATG CCA AAT GCT TGG GCA TCA ATC CTG 1502 
Val Ile Ser Asn Ile Cys Gln Met Pro Asn Ala Trp Ala Ser Ile Leu
465 470 475 



TGG TAT AAC ATG CTG ACC AAT AAC CCC AAG AAC GTG AAC TTC TTC ACT 1550 
Trp Tyr Asn Met Leu Thr Asn Asn Pro Lys Asn Val Asn Phe Phe Thr 
480 485 490 

AAG CCG CCA ATT GGA ACC TGG GAC CAA GTG GCC GAG GTG CTC AGC TGG 1598 
Lys Pro Pro Ile Gly Thr Trp Asp Gln Val Ala Glu Val Leu Ser Trp
495 500 505 510 

CAG TTC TCG TCC ACC ACC AAG CGA GGG CTG AGC ATC GAG CAG CTG ACA 1646
Gln Phe Ser Ser Thr Thr Lys Arg Gly Leu Ser Ile Glu Gln Leu Thr
515 520 525 

ACG CTG GCT GAG AAG CTC CTA GGG CCT GGT GTG AAC TAC TCA GGG TGT 1694
Thr Leu Ala Glu Lys Leu Leu Gly Pro Gly Val Asn Tyr Ser Gly Cys 
530 535 540 

CAG ATC ACA TGG GCT AAA TTC TGC AAA GAA AAC ATG GCT GGC AAG GGC 1742 
Gln Ile Thr Trp Ala Lys Phe Cys Lys Glu Asn Met Ala Gly Lys Gly
545 550 555

TTC TCC TTC TGG GTC TGG CTA GAC AAT ATC ATC GAC CTT GTG AAA AAG 1790 
Phe Ser Phe Trp Val Trp Leu Asp Asn Ile Ile Asp Leu Val Lys Lys
560 565 570 

TAT ATC TTG GCC CTT TGG AAT GAA GGG TAC ATC ATG GGT TTC ATC AGC 1838 
Tyr Ile Leu Ala Leu Trp Asn Glu Gly Tyr Ile Met Gly Phe Ile Ser
575 580 585 590 

AAG GAG CGG GAG CGG GCC ATC CTA AGC ACA AAG CCC CCG GGC ACC TTC 1886
Lys Glu Arg Glu Arg Ala Ile Leu Ser Thr Lys Pro Pro Gly Thr Phe
595 600 605 

CTA CTG CGC TTC AGC GAG AGC AGC AAA GAA GGA GGG GTC ACT TTC ACT 1934 
Leu Leu Arg Phe Ser Glu Ser Ser Lys Glu Gly Gly Val Thr Phe Thr 
610 615 620 

TGG GTG GAA AAG GAC ATC AGT GGC AAG ACC CAG ATC CAG TCT GTA GAG 1982
Trp Val Glu Lys Asp Ile Ser Gly Lys Thr Gln Ile Gln Ser Val Glu
625 630 635 

CCA TAC ACC AAG CAG CAG CTG AAC AAC ATG TCA TTT GCT GAA ATC ATC 2030
Pro Tyr Thr Lys Gln Gln Leu Asn Asn Met Ser Phe Ala Glu Ile Ile
640 645 650 

ATG GGC TAT AAG ATC ATG GAT GCG ACC AAC ATC CTG GTG TCT CCA CTT 2078 
Met Gly Tyr Lys Ile Met Asp Ala Thr Asn Ile Leu Val Ser Pro Leu
655 660 665 670 

GTC TAC CTC TAC CCC GAC ATT CCC AAG GAG GAG GCA TTT GGA AAG TAC 2126 
Val Tyr Leu Tyr Pro Asp Ile Pro Lys Glu Glu Ala Phe Gly Lys Tyr
675 680 685 

TGT AGG CCC GAG AGC CAG GAG CAC CCC GAA GCC GAC CCA GGT AGT GCT 2174 
Cys Arg Pro Glu Ser Gln Glu His Pro Glu Ala Asp Pro Gly Ser Ala
690 695 700

GCC CCG TAC CTG AAG ACC AAG TTC ATC TGT GTG ACA CCA ACG ACC TGC 2222 
Ala Pro Tyr Leu Lys Thr Lys Phe Ile Cys Val Thr Pro Thr Thr Cys
705 710 715 

AGC AAT ACC ATT GAC CTG CCG ATG TCC CCC CGC ACT TTA GAT TCA TTG 2270
Ser Asn Thr Ile Asp Leu Pro Met Ser Pro Arg Thr Leu Asp Ser Leu
720 725 730 

ATG CAG TTT GGA AAT AAC GGT GAA GGT GCT GAG CCC TCA GCA GGA GGG 2318 
Met Gln Phe Gly Asn Asn Gly Glu Gly Ala Glu Pro Ser Ala Gly Gly
735 740 745 750 



CAG TTT GAG TCG CTC ACG TTT GAC ATG GAT CTG ACC TCG GAG TGT GCT 2366
Gln Phe Glu Ser Leu Thr Phe Asp Met Asp Leu Thr Ser Glu Cys Ala
755 760 765 

ACC TCC CCC ATG TGAGGAGCTG AAACCAGAAG CTGCAGAGAC GTGACTTGAG 2418 
Thr Ser Pro Met
770

ACACCTGCCC CGTGCTCCAC CCCTAAGCAG CCGAACCCCA TATCGTCTGA AACTCCTAAC 2478 

TTTGTGGTTC CAGATTTTTT TTTTTAATTT CCTACTTCTG CTATCTTTGG GCAATCTGGG 2538 

CACTTTTTAA AAGAGAGAAA TGAGTGAGTG TGGGTGATAA ACTGTTATGT AAAGAGGAGA 2598

GACCTCTGAG TCTGGGGATG GGGCTGAGAG CAGAAGGGAG GCAAAGGGGA ACACCTCCTG 2658 

TCCTGCCCGC CTGCCCTCCT TTTTCAGCAG CTCGGGGGTT GGTTGTTAGA CAAGTGCCTC 2718 

CTGGTGCCCA TGGCTACCTG TTGCCCCACT CTGTGAGCTG ATACCCCATT CTGGGAACTC 2778 

CTGGCTCTGC ACTTTCAACC TTGCTAATAT CCACATAGAA GCTAGGACTA AGCCCAGGAG 2838

GTTCCTCTTT AAATTAAAAA AAAAAAAAAA A 2869 

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 770 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

Met Ala Gln Trp Asn Gln Leu Gln Gln Leu Asp Thr Arg Tyr Leu Lys
1 5 10 15

Gln Leu His Gln Leu Tyr Ser Asp Thr Phe Pro Met Glu Leu Arg Gln
20 25 30 

Phe Leu Ala Pro Trp Ile Glu Ser Gln Asp Trp Ala Tyr Ala Ala Ser
35 40 45 

Lys Glu Ser His Ala Thr Leu Val Phe His Asn Leu Leu Gly Glu Ile
50 55 60 

Asp Gln Gln Tyr Ser Arg Phe Leu Gln Glu Ser Asn Val Leu Tyr Gln
65 70 75 80 

His Asn Leu Arg Arg Ile Lys Gln Phe Leu Gln Ser Arg Tyr Leu Glu
85 90 95 

Lys Pro Met Glu Ile Ala Arg Ile Val Ala Arg Cys Leu Trp Glu Glu
100 105 110 

Ser Arg Leu Leu Gln Thr Ala Ala Thr Ala Ala Gln Gln Gly Gly Gln
115 120 125 

Ala Asn His Pro Thr Ala Ala Val Val Thr Glu Lys Gln Gln Met Leu
130 135 140 

Glu Gln His Leu Gln Asp Val Arg Lys Arg Val Gln Asp Leu Glu Gln
145 150 155 160 

Lys Met Lys Val Val Glu Asn Leu Gln Asp Asp Phe Asp Phe Asn Tyr
165 170 175 



Lys Thr Leu Lys Ser Gln Gly Asp Met Gln Asp Leu Asn Gly Asn Asn
180 185 190 

Gln Ser Val Thr Arg Gln Lys Met Gln Gln Leu Glu Gln Met Leu Thr
195 200 205 

Ala Leu Asp Gln Met Arg Arg Ser Ile Val Ser Glu Leu Ala Gly Leu
210 215 220 

Leu Ser Ala Met Glu Tyr Val Gln Lys Thr Leu Thr Asp Glu Glu Leu
225 230 235 240 

Ala Asp Trp Lys Arg Arg Pro Glu Ile Ala Cys Ile Gly Gly Pro Pro
245 250 255 

Asn Ile Cys Leu Asp Arg Leu Glu Asn Trp Ile Thr Ser Leu Ala Glu
260 265 270 

Ser Gln Leu Gln Thr Arg Gln Gln Ile Lys Lys Leu Glu Glu Leu Gln
275 280 285 

Gln Lys Val Ser Tyr Lys Gly Asp Pro Ile Val Gln His Arg Pro Met
290 295 300 

Leu Glu Glu Arg Ile Val Glu Leu Phe Arg Asn Leu Met Lys Ser Ala
305 310 315 320 

Phe Val Val Glu Arg Gln Pro Cys Met Pro Met His Pro Asp Arg Pro
325 330 335 

Leu Val Ile Lys Thr Gly Val Gln Phe Thr Thr Lys Val Arg Leu Leu
340 345 350 

Val Lys Phe Pro Glu Leu Asn Tyr Gln Leu Lys Ile Lys Val Cys Ile
355 360 365 

Asp Lys Asp Ser Gly Asp Val Ala Ala Leu Arg Gly Ser Arg Lys Phe 
370 375 380 

Asn Ile Leu Gly Thr Asn Thr Lys Val Met Asn Met Glu Glu Ser Asn
385 390 395 400 

Asn Gly Ser Leu Ser Ala Glu Phe Lys His Leu Thr Leu Arg Glu Gln
405 410 415 

Arg Cys Gly Asn Gly Gly Arg Ala Asn Cys Asp Ala Ser Leu Ile Val
420 425 430 

Thr Glu Glu Leu His Leu Ile Thr Phe Glu Thr Glu Val Tyr His Gln
435 440 445 

Gly Leu Lys Ile Asp Leu Glu Thr His Ser Leu Pro Val Val Val Ile
450 455 460 

Ser Asn Ile Cys Gln Met Pro Asn Ala Trp Ala Ser Ile Leu Trp Tyr
465 470 475 480 

Asn Met Leu Thr Asn Asn Pro Lys Asn Val Asn Phe Phe Thr Lys Pro 
485 490 495 

Pro Ile Gly Thr Trp Asp Gln Val Ala Glu Val Leu Ser Trp Gln Phe
500 505 510 

Ser Ser Thr Thr Lys Arg Gly Leu Ser Ile Glu Gln Leu Thr Thr Leu
515 520 525 

Ala Glu Lys Leu Leu Gly Pro Gly Val Asn Tyr Ser Gly Cys Gln Ile
530 535 540 



Thr Trp Ala Lys Phe Cys Lys Glu Asn Met Ala Gly Lys Gly Phe Ser 
545 550 555 560 

Phe Trp Val Trp Leu Asp Asn Ile Ile Asp Leu Val Lys Lys Tyr Ile
565 570 575 

Leu Ala Leu Trp Asn Glu Gly Tyr Ile Met Gly Phe Ile Ser Lys Glu
580 585 590 

Arg Glu Arg Ala Ile Leu Ser Thr Lys Pro Pro Gly Thr Phe Leu Leu
595 600 605 

Arg Phe Ser Glu Ser Ser Lys Glu Gly Gly Val Thr Phe Thr Trp Val 
610 615 620 

Glu Lys Asp Ile Ser Gly Lys Thr Gln Ile Gln Ser Val Glu Pro Tyr
625 630 635 640 

Thr Lys Gln Gln Leu Asn Asn Met Ser Phe Ala Glu Ile Ile Met Gly
645 650 655 

Tyr Lys Ile Met Asp Ala Thr Asn Ile Leu Val Ser Pro Leu Val Tyr
660 665 670 

Leu Tyr Pro Asp Ile Pro Lys Glu Glu Ala Phe Gly Lys Tyr Cys Arg
675 680 685 

Pro Glu Ser Gln Glu His Pro Glu Ala Asp Pro Gly Ser Ala Ala Pro
690 695 700 

Tyr Leu Lys Thr Lys Phe Ile Cys Val Thr Pro Thr Thr Cys Ser Asn
705 710 715 720 

Thr Ile Asp Leu Pro Met Ser Pro Arg Thr Leu Asp Ser Leu Met Gln
725 730 735 

Phe Gly Asn Asn Gly Glu Gly Ala Glu Pro Ser Ala Gly Gly Gln Phe
740 745 750 

Glu Ser Leu Thr Phe Asp Met Asp Leu Thr Ser Glu Cys Ala Thr Ser 
755 760 765 

Pro Met 
770 

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

AAYACNGARC CNATGGARAT YATT 24

(2) INFORMATION FOR SEQ ID NO: 14:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
AAYGTNGAYC ARYTNAAYAT G 
(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:
RTCDATRTTN GRGTANAR 
(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
GTAYAANTYR AYCAGNGYAA 
(2) INFORMATION FOR SEQ ID NO: 17: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 
GATCGAGATG TATTTCCCAG AAAAG 
(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(v) FRAGMENT TYPE: internal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

Leu Asp Gly Pro Lys Gly Thr Gly Tyr Ile Lys Thr Glu Leu Ile
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO

(v) FRAGMENT TYPE: internal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

Gly Tyr Ile Lys Thr Glu
1 5 

(2) INFORMATION FOR SEQ ID NO: 20: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO
(v) FRAGMENT TYPE: internal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

Lys Val Asn Leu Gln Glu Arg Arg Lys Tyr Leu Lys His Arg
1 5 10

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO

(v) FRAGMENT TYPE: internal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 

Glu Pro Gln Tyr Glu Glu Ile Pro Ile Tyr Leu
1 5 10

(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 105 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO

(v) FRAGMENT TYPE: internal 



(vii) IMMEDIATE SOURCE: 
(B) CLONE: Src 



(x) PUBLICATION INFORMATION: 

(A) AUTHORS: Waksman, et al.



(C) JOURNAL: Nature 

(D) VOLUME: 358 

(F) PAGES: 646-653 

(G) DATE: 1992 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 

Ala Glu Glu Trp Tyr Phe Gly Lys Ile Thr Arg Arg Glu Ser Glu Arg
1 5 10 15

Leu Leu Leu Asn Pro Glu Asn Pro Arg Gly Thr Phe Leu Val Arg Glu 
20 25 30 

Ser Glu Thr Thr Lys Gly Ala Tyr Cys Leu Ser Val Ser Asp Phe Phe 
35 40 45 

Asp Asn Ala Lys Gly Leu Asn Val Lys His Tyr Lys Ile Arg Lys Leu
50 55 60 

Asp Ser Gly Gly Phe Tyr Ile Thr Ser Arg Thr Gln Phe Ser Ser Leu
65 70 75 80 

Gln Gln Leu Val Ala Tyr Tyr Ser Lys His Ala Asp Gly Leu Cys His
85 90 95 

Arg Leu Thr Asn Val Cys Pro Thr Ser 
100 105 

(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 99 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO

(v) FRAGMENT TYPE: internal 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: Abl 

(x) PUBLICATION INFORMATION: 

(A) AUTHORS: Overduin, et al.

(C) JOURNAL: Proc. Natl. Acad. Sci. U.S.A.

(D) VOLUME: 89

(F) PAGES: 11673-11677 

(G) DATE: 1992

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

Glu Lys His Ser Trp Tyr His Gly Pro Val Ser Arg Asn Ala Ala Glu 
1 5 10 15

Tyr Leu Leu Ser Ser Gly Ile Asn Gly Ser Phe Leu Val Arg Glu Ser
20 25 30 

Asp Arg Arg Pro Gly Gln Arg Ser Ile Ser Leu Arg Tyr Glu Glu Gly
35 40 45 

Arg Val Tyr His Tyr Arg Ile Asn Thr Ala Ser Asp Gly Lys Leu Tyr
50 55 60 



Val Ser Ser Glu Ser Arg Phe Asn Thr Leu Ala Glu Leu Val His His 
65 70 75 80 

His Ser Thr Val Ala Asp Gly Leu Ile Thr Thr Leu His Tyr Pro Ala
85 90 95 

Pro Lys Arg 



(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 102 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO

(v) FRAGMENT TYPE: internal 



(vii) IMMEDIATE SOURCE: 
(B) CLONE: Lck 



(x) PUBLICATION INFORMATION: 
(A) AUTHORS: Eck, et al.

(C) JOURNAL: Nature 

(D) VOLUME: 362

(F) PAGES: 87-91 

(G) DATE: 1993 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

Trp Phe Phe Lys Asn Leu Ser Arg Lys Asp Ala Glu Arg Gln Leu Leu
1 5 10 15

Ala Pro Gly Asn Thr His Gly Ser Phe Leu Ile Arg Glu Ser Glu Ser
20 25 30 

Thr Ala Gly Ser Phe Ser Leu Ser Val Arg Asp Asp Phe Asp Gln Asn
35 40 45 

Gln Gly Glu Val Val Lys His Tyr Lys Ile Arg Asn Leu Asp Asn Gly
50 55 60 

Gly Phe Tyr Ile Ser Pro Arg Ile Thr Phe Pro Gly Leu His Asp Leu
65 70 75 80 

Val Arg His Tyr Thr Asn Ala Ser Asp Gly Leu Cys Thr Arg Leu Ser
85 90 95 

Arg Pro Cys Gln Thr Gln
100 

(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 99 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: peptide 



(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(v) FRAGMENT TYPE: internal 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: p85 [alpha] N 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 

Gln Asp Ala Glu Trp Tyr Trp Gly Asp Ile Ser Arg Glu Glu Val Asn
1 5 10 15

Glu Lys Leu Arg Asp Thr Ala Asp Gly Thr Phe Leu Val Arg Asp Ala 
20 25 30

Ser Thr Lys Met His Gly Asp Tyr Thr Leu Thr Leu Arg Lys Gly Gly 
35 40 45 

Asn Asn Lys Leu Ile Lys Ile Phe His Arg Asp Gly Lys Tyr Gly Phe
50 55 60

Ser Asp Pro Leu Thr Phe Asn Ser Val Val Glu Leu Ile Asn His Tyr
65 70 75 80 

Arg His Glu Ser Leu Ala Gln Tyr Asn Pro Lys Leu Asp Val Lys Leu
85 90 95 



Leu Tyr Pro 
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